StellaOps Authority Service

Status: Drafted 2025-10-12 (CORE5B.DOC / DOC1.AUTH) – aligns with Authority revocation store, JWKS rotation, and bootstrap endpoints delivered in Sprint 1.

1. Purpose

The StellaOps Authority service issues OAuth2/OIDC tokens for every StellaOps module (Concelier, Backend, Agent, Zastava) and exposes the policy controls required in sovereign/offline environments. Authority is built as a minimal ASP.NET host that:

brokers password, client-credentials, and device-code flows through pluggable identity providers;
persists access/refresh/device tokens in MongoDB with deterministic schemas for replay analysis and air-gapped audit copies;
distributes revocation bundles and JWKS material so downstream services can enforce lockouts without direct database access;
offers bootstrap APIs for first-run provisioning and key rotation without redeploying binaries.

Authority is deployed alongside Concelier in air-gapped environments and never requires outbound internet access. All trusted metadata (OpenIddict discovery, JWKS, revocation bundles) is cacheable, signed, and reproducible.

2. Component Architecture

Authority is composed of five cooperating subsystems:

Minimal API host – configures OpenIddict endpoints (/token, /authorize, /revoke, /jwks), publishes the OpenAPI contract at /.well-known/openapi, and enables structured logging/telemetry. Rate limiting hooks (AuthorityRateLimiter) wrap every request.
Plugin host – loads StellaOps.Authority.Plugin.*.dll assemblies, applies capability metadata, and exposes password/client provisioning surfaces through dependency injection.
Mongo storage – persists tokens, revocations, bootstrap invites, and plugin state in deterministic collections indexed for offline sync (authority_tokens, authority_revocations, etc.).
Cryptography layer – StellaOps.Cryptography abstractions manage password hashing, signing keys, JWKS export, and detached JWS generation.
Offline ops APIs – internal endpoints under /internal/* provide administrative flows (bootstrap users/clients, revocation export) guarded by API keys and deterministic audit events.

A high-level sequence for password logins:

Client -> /token (password grant)
  -> Rate limiter & audit hooks
  -> Plugin credential store (Argon2id verification)
  -> Token persistence (Mongo authority_tokens)
  -> Response (access/refresh tokens + deterministic claims)

3. Token Lifecycle & Persistence

Authority persists every issued token in MongoDB so operators can audit or revoke without scanning distributed caches.

Collection: authority_tokens
Key fields:
tokenId, type (access_token, refresh_token, device_code, authorization_code)
subjectId, clientId, ordered scope array
tenant (lower-cased tenant hint from the issuing client, omitted for global clients)

Console OIDC client

Client ID: console-web
Grants: authorization_code (PKCE required), refresh_token
Audience: console
Scopes: openid, profile, email, advisory:read, vex:read, aoc:verify, findings:read, orch:read, vuln:read
Redirect URIs (defaults): https://console.stella-ops.local/oidc/callback
Post-logout redirect: https://console.stella-ops.local/
Tokens: Access tokens inherit the global 2 minute lifetime; refresh tokens remain short-lived (30 days) and can be exchanged silently via /token.
Roles: Assign Authority role Orch.Viewer (exposed to tenants as role/orch-viewer) when operators need read-only access to Orchestrator telemetry via Console dashboards. Policy Studio ships dedicated roles (role/policy-author, role/policy-reviewer, role/policy-approver, role/policy-operator, role/policy-auditor) that align with the new policy:* scope family; issue them per tenant so audit trails remain scoped.

Configuration sample (etc/authority.yaml.sample) seeds the client with a confidential secret so Console can negotiate the code exchange on the backend while browsers execute the PKCE dance.

Console Authority endpoints

/console/tenants — Requires authority:tenants.read; returns the tenant catalogue for the authenticated principal. Requests lacking the X-Stella-Tenant header are rejected (tenant_header_missing) and logged.
/console/profile — Requires ui.read; exposes subject metadata (roles, scopes, audiences) and indicates whether the session is within the five-minute fresh-auth window.
/console/token/introspect — Requires ui.read; introspects the active access token so the SPA can prompt for re-authentication before privileged actions.

All endpoints demand DPoP-bound tokens and propagate structured audit events (authority.console.*). Gateways must forward the X-Stella-Tenant header derived from the access token; downstream services rely on the same value for isolation. Keep Console access tokens short-lived (default 15 minutes) and enforce the fresh-auth window for admin actions (ui.admin, authority:*, policy:activate, exceptions:approve).

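Additional authority_tokens key fields (continuing the list that precedes the Console OIDC client notes):
status (valid, revoked, expired), createdAt, optional expiresAt
revokedAt, machine-readable revokedReason, optional revokedReasonDescription
revokedMetadata (string dictionary for plugin-specific context)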
Persistence flow: PersistTokensHandler stamps missing JWT IDs, normalises scopes, and stores every principal emitted by OpenIddict.
Revocation flow: AuthorityTokenStore.UpdateStatusAsync flips status, records the reason metadata, and is invoked by token revocation handlers and plugin provisioning events (e.g., disabling a user).
Expiry maintenance: AuthorityTokenStore.DeleteExpiredAsync prunes non-revoked tokens past their expiresAt timestamp. Operators should schedule this in maintenance windows if large volumes of tokens are issued.
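
The status and pruning rules above can be sketched in memory. This is a minimal Python sketch, not Authority's implementation: `TokenDocument`, `update_status`, and `delete_expired` are hypothetical names that only mirror the fields and behaviour described for authority_tokens.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class TokenDocument:
    # Field names loosely follow the authority_tokens key fields (hypothetical model).
    token_id: str
    type: str
    status: str = "valid"
    expires_at: Optional[datetime] = None
    revoked_at: Optional[datetime] = None
    revoked_reason: Optional[str] = None
    revoked_metadata: dict = field(default_factory=dict)


def update_status(doc, status, reason=None, now=None):
    """Flip the status; record revocation metadata when revoking."""
    doc.status = status
    if status == "revoked":
        doc.revoked_at = now or datetime.now(timezone.utc)
        doc.revoked_reason = reason
    return doc


def delete_expired(docs, now):
    """Drop non-revoked tokens past expires_at; revoked tokens are left in place."""
    return [
        d for d in docs
        if d.status == "revoked" or d.expires_at is None or d.expires_at > now
    ]
```

Revoked documents survive pruning in this sketch because the text says only non-revoked tokens are pruned.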

Expectations for resource servers

Resource servers (Concelier WebService, Backend, Agent) must not assume in-memory caches are authoritative. They should:

cache /jwks and /revocations/export responses within configured lifetimes;
honour revokedReason metadata when shaping audit trails;
treat status != "valid" or missing tokens as immediate denial conditions;
propagate the tenant claim (X-Stella-Tenant header in REST calls) and reject requests when the tenant supplied by Authority does not match the resource server's scope; Concelier and Excititor guard endpoints refuse cross-tenant tokens.
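
As an illustration of the denial rules in this list, the following Python sketch shows one way a resource server might evaluate a looked-up token record. The `authorize` helper, the dict shape, and the reason strings are assumptions made for the example. It covers tenant-scoped endpoints only, so global clients (which carry no tenant) are out of scope here.

```python
def authorize(token_record, request_tenant):
    """Return (allowed, reason) for a token record looked up by the resource server.

    token_record is a dict shaped like an authority_tokens document, or None
    when the token is unknown (hypothetical shape for this sketch).
    """
    if token_record is None:
        return False, "token_missing"
    if token_record.get("status") != "valid":
        # Surface the machine-readable reason so audit trails can honour it.
        return False, token_record.get("revokedReason") or "token_not_valid"
    if token_record.get("tenant") != request_tenant:
        return False, "tenant_mismatch"
    return True, "ok"
```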

Tenant propagation

Client provisioning (bootstrap or plug-in) accepts a tenant hint. Authority normalises the value (trim().ToLowerInvariant()) and persists it alongside the registration. Clients without an explicit tenant remain global.
Issued principals include the stellaops:tenant claim. PersistTokensHandler mirrors this claim into authority_tokens.tenant, enabling per-tenant revocation and reporting.
Rate limiter metadata now tags requests with authority.tenant, unlocking per-tenant throughput metrics and diagnostic filters. Audit events (authority.client_credentials.grant, authority.password.grant, bootstrap flows) surface the tenant, and login-attempt documents are indexed on {tenant, occurredAt} for quick queries.
Client credentials that request advisory:ingest, advisory:read, vex:ingest, vex:read, signals:read, signals:write, signals:admin, or aoc:verify now fail fast when the client registration lacks a tenant hint. Issued tokens are re-validated against persisted tenant metadata, and Authority rejects any cross-tenant replay (invalid_client/invalid_token), ensuring aggregation-only workloads remain tenant-scoped.
Client credentials that request export.viewer, export.operator, or export.admin must provide a tenant hint. Requests for export.admin also need accompanying export_reason and export_ticket parameters; Authority returns invalid_request when either value is missing and records the denial in token audit events.
Policy Studio scopes (policy:author, policy:review, policy:approve, policy:operate, policy:audit, policy:simulate, policy:run, policy:activate) require a tenant assignment; Authority rejects tokens missing the hint with invalid_client and records scope.invalid metadata for auditing.
AOC pairing guardrails – Tokens that request advisory:read, vex:read, or any signals:* scope must also request aoc:verify. Authority rejects mismatches with invalid_scope and one of two messages, "Scope 'aoc:verify' is required when requesting advisory/vex read scopes." or "Scope 'aoc:verify' is required when requesting signals scopes.", so automation surfaces deterministic errors.
Signals ingestion guardrails – Sensors and services requesting signals:write/signals:admin must also request aoc:verify; Authority records the authority.aoc_scope_violation tag when the pairing is missing so operators can trace failing sensors immediately.
Password grant flows reuse the client registration's tenant and enforce the configured scope allow-list. Requested scopes outside that list (or mismatched tenants) trigger invalid_scope/invalid_client failures, ensuring cross-tenant access is denied before token issuance.
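
Two of the rules above (the tenant hint requirement and the aoc:verify pairing) can be sketched in Python. The function names are hypothetical. Python's `strip().lower()` only approximates the `trim().ToLowerInvariant()` normalisation. Returning invalid_client for a missing tenant hint is an assumption by analogy with the Policy Studio rule, since the text only says such requests fail fast.

```python
# Scopes that the text says fail fast without a tenant hint.
TENANT_REQUIRED = {
    "advisory:ingest", "advisory:read", "vex:ingest", "vex:read",
    "signals:read", "signals:write", "signals:admin", "aoc:verify",
}


def normalise_tenant(value):
    """Trim and lower-case a tenant hint; empty values count as missing."""
    if value is None:
        return None
    value = value.strip().lower()
    return value or None


def check_client_credentials(scopes, tenant):
    """Return None when the request passes, otherwise an OAuth error code."""
    requested = set(scopes)
    if requested & TENANT_REQUIRED and normalise_tenant(tenant) is None:
        return "invalid_client"  # assumed error code, see lead-in
    needs_aoc = bool(requested & {"advisory:read", "vex:read"}) or any(
        s.startswith("signals:") for s in requested
    )
    if needs_aoc and "aoc:verify" not in requested:
        return "invalid_scope"
    return None
```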

Default service scopes

| Client ID | Purpose | Scopes granted | Sender constraint | Tenant |
| --- | --- | --- | --- | --- |
| `concelier-ingest` | Concelier raw advisory ingestion | `advisory:ingest`, `advisory:read` | `dpop` | `tenant-default` |
| `excitor-ingest` | Excititor raw VEX ingestion | `vex:ingest`, `vex:read` | `dpop` | `tenant-default` |
| `aoc-verifier` | Aggregation-only contract verification | `aoc:verify`, `advisory:read`, `vex:read` | `dpop` | `tenant-default` |
| `cartographer-service` | Graph snapshot construction | `graph:write`, `graph:read` | `dpop` | `tenant-default` |
| `graph-api` | Graph Explorer gateway/API | `graph:read`, `graph:export`, `graph:simulate` | `dpop` | `tenant-default` |
| `export-center-operator` | Export Center operator automation | `export.viewer`, `export.operator` | `dpop` | `tenant-default` |
| `export-center-admin` | Export Center administrative automation | `export.viewer`, `export.operator`, `export.admin` | `dpop` | `tenant-default` |
| `vuln-explorer-ui` | Vuln Explorer UI/API | `vuln:read` | `dpop` | `tenant-default` |
| `signals-uploader` | Reachability sensor ingestion | `signals:write`, `signals:read`, `aoc:verify` | `dpop` | `tenant-default` |

Secret hygiene (2025‑10‑27): The repository includes a convenience etc/authority.yaml for compose/helm smoke tests. Every entry’s secretFile points to etc/secrets/*.secret, which ship with *-change-me placeholders—replace them with strong values (and wire them through your vault/secret manager) before issuing tokens in CI, staging, or production.

For factory provisioning, issue sensors the SignalsUploader role template (signals:write, signals:read, aoc:verify). Authority rejects ingestion tokens that omit aoc:verify, preserving aggregation-only contract guarantees for reachability signals.

These registrations are provided as examples in etc/authority.yaml.sample. Clone them per tenant (for example concelier-tenant-a, concelier-tenant-b) so tokens remain tenant-scoped by construction.

Graph Explorer introduces dedicated scopes: graph:write for Cartographer build jobs, graph:read for query/read operations, graph:export for long-running export downloads, and graph:simulate for what-if overlays. Assign only the scopes a client actually needs to preserve least privilege—UI-facing clients should typically request read/export access, while background services (Cartographer, Scheduler) require write privileges.

Least-privilege guidance for graph clients

Service identities – The Cartographer worker should request graph:write and graph:read only; grant graph:simulate exclusively to pipeline automation that invokes Policy Engine overlays on demand. Keep graph:export scoped to API gateway components responsible for streaming GraphML/JSONL artifacts. Authority enforces this by rejecting graph:write tokens that lack properties.serviceIdentity: cartographer.
Tenant propagation – Every client registration must pin a tenant hint. Authority normalises the value and stamps it into issued tokens (stellaops:tenant) so downstream services (Scheduler, Graph API, Console) can enforce tenant isolation without custom headers. Graph scopes (graph:read, graph:write, graph:export, graph:simulate) are denied if the tenant hint is missing.
SDK alignment – Use the generated StellaOpsScopes constants in service code to request graph scopes. Hard-coded strings risk falling out of sync as additional graph capabilities are added.
DPoP for automation – Maintain sender-constrained (dpop) flows for Cartographer and Scheduler to limit reuse of access tokens if a build host is compromised. For UI-facing tokens, pair graph:read/graph:export with short lifetimes and enforce refresh-token rotation at the gateway.

Export Center scope guardrails

Viewer vs operator – export.viewer grants read-only access to export profiles, manifests, and bundles. Automation that schedules or reruns exports should request export.operator (and typically export.viewer). Tenant hints remain mandatory; Authority refuses tokens without them.
Administrative mutations – Changes to retention policies, encryption key references, or schedule defaults require export.admin. When requesting tokens with this scope, clients must supply export_reason and export_ticket parameters; Authority persists the values for audit records and rejects missing metadata with invalid_request.
Operational hygiene – Rotate export.admin credentials frequently and run them through fresh-auth workflows where possible. Prefer distributing verification tooling with export.viewer tokens for day-to-day bundle validation.

Vuln Explorer permalinks

Scope – vuln:read authorises Vuln Explorer to fetch advisory/linkset evidence and issue shareable links. Assign it only to front-end/API clients that must render vulnerability details.
Signed links – POST /permalinks/vuln (requires vuln:read) accepts { "tenant": "tenant-a", "resourceKind": "vulnerability", "state": { ... }, "expiresInSeconds": 86400 } and returns a JWT (token) plus issuedAt/expiresAt. The token embeds the tenant, requested state, and vuln:read scope and is signed with the same Authority signing keys published via /jwks.
Validation – Resource servers verify the permalink using cached JWKS: check signature, ensure the tenant matches the current request context, honour the expiry, and enforce the contained vuln:read scope. The payload’s resource.state block is opaque JSON so UIs can round-trip filters/search terms without new schema changes.
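
The post-signature checks in the validation step might look like the Python sketch below. It assumes the JWT signature has already been verified against cached JWKS. The claim names `tenant` and `scope` are hypothetical, because the text does not specify how the permalink token encodes them; `exp` is the standard JWT expiry claim.

```python
import time


def validate_permalink_claims(claims, request_tenant, now=None):
    """Check tenant, expiry, and scope on an already signature-verified payload."""
    now = time.time() if now is None else now
    if claims.get("tenant") != request_tenant:
        return False  # tenant must match the current request context
    if claims.get("exp", 0) <= now:
        return False  # honour the expiry
    # Assumes a space-delimited scope string, as in OAuth 2.0 token responses.
    return "vuln:read" in claims.get("scope", "").split()
```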

4. Revocation Pipeline

Authority centralises revocation in authority_revocations with deterministic categories:

| Category | Meaning | Required fields |
| --- | --- | --- |
| `token` | Specific OAuth token revoked early. | `revocationId` (token id), `tokenType`, optional `clientId`, `subjectId` |
| `subject` | All tokens for a subject disabled. | `revocationId` (= subject id) |
| `client` | OAuth client registration revoked. | `revocationId` (= client id) |
| `key` | Signing/JWE key withdrawn. | `revocationId` (= key id) |

RevocationBundleBuilder flattens Mongo documents into canonical JSON, sorts entries by (category, revocationId, revokedAt), and signs exports using detached JWS (RFC 7797) with cosign-compatible headers.
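
To illustrate why the sort order yields deterministic output, here is a minimal Python sketch of the ordering and digest steps. It is not RevocationBundleBuilder itself: the `revocations` wrapper key and the use of sorted keys with compact separators as the canonical form are assumptions, revokedAt values are assumed to be ISO-8601 UTC strings in a single format, and signing is omitted.

```python
import hashlib
import json


def build_bundle(entries):
    """Sort entries by (category, revocationId, revokedAt) and hash the bytes."""
    ordered = sorted(
        entries,
        key=lambda e: (e["category"], e["revocationId"], e["revokedAt"]),
    )
    # Sorted keys and fixed separators stand in for canonical JSON here.
    payload = json.dumps(
        {"revocations": ordered}, sort_keys=True, separators=(",", ":")
    ).encode("utf-8")
    return payload, hashlib.sha256(payload).hexdigest()
```

Because both the entry order and the key order are fixed, two exports of the same datastore state produce identical bytes and therefore the same digest.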

Export surfaces (deterministic output, suitable for Offline Kit):

CLI: stella auth revoke export --output ./out writes revocation-bundle.json, .jws, .sha256.
API: GET /internal/revocations/export (requires bootstrap API key) returns the same payload.
Verification: stella auth revoke verify --bundle <path> --signature <path> --key <path> validates the schema, digest, and detached JWS signature before distribution, using cached JWKS or offline keys. It automatically prefers the crypto provider advertised in the detached header (libsodium builds honour provider=libsodium; other builds fall back to the managed provider); see docs/security/revocation-bundle.md.

Consumer guidance:

Mirror revocation-bundle.json* alongside Concelier exports. Offline agents fetch both over the existing update channel.
Use bundle sequence and bundleId to detect replay or monotonicity regressions. Ignore bundles with older sequence numbers unless bundleId changes and issuedAt advances.
Treat revokedReason taxonomy as machine-friendly codes (compromised, rotation, policy, lifecycle). Translating to human-readable logs is the consumer’s responsibility.
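
The replay rule in the consumer guidance can be sketched as a small Python predicate. The `should_accept` name and dict shape are hypothetical. The sketch assumes issuedAt values are ISO-8601 UTC strings in one fixed format (so string comparison matches time order) and treats an equal sequence number the same way as an older one.

```python
def should_accept(current, candidate):
    """Decide whether a newly fetched bundle should replace the cached one."""
    if current is None:
        return True  # nothing cached yet
    if candidate["sequence"] > current["sequence"]:
        return True  # normal monotonic advance
    # Older or equal sequence: accept only if bundleId changed AND issuedAt advanced.
    return (
        candidate["bundleId"] != current["bundleId"]
        and candidate["issuedAt"] > current["issuedAt"]
    )
```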

5. Signing Keys & JWKS Rotation

Authority signs revocation bundles and publishes JWKS entries via the new signing manager:

Configuration (authority.yaml):

signing:
  enabled: true
  algorithm: ES256            # Defaults to ES256
  keySource: file             # Loader identifier (file, vault, etc.)
  provider: default           # Optional preferred crypto provider
  activeKeyId: authority-signing-dev
  keyPath: "../certificates/authority-signing-dev.pem"
  additionalKeys:
    - keyId: authority-signing-dev-2024
      path: "../certificates/authority-signing-dev-2024.pem"
      source: "file"

Sources: The default loader supports PEM files relative to the content root; additional loaders can be registered via IAuthoritySigningKeySource.
Providers: Keys are registered against the ICryptoProviderRegistry, so alternative implementations (HSM, libsodium) can be plugged in without changing host code.
OpenAPI discovery: GET /.well-known/openapi returns the published authentication contract (JSON by default, YAML when requested). Responses include X-StellaOps-Service, X-StellaOps-Api-Version, X-StellaOps-Build-Version, plus grant and scope headers, and honour conditional requests via ETag/If-None-Match.
JWKS output: GET /jwks lists every signing key with status metadata (active, retired). Old keys remain until operators remove them from configuration, allowing verification of historical bundles/tokens.

Rotation SOP (no downtime)

Generate a new P-256 private key (PEM) on an offline workstation and place it where the Authority host can read it (e.g., ../certificates/authority-signing-2025.pem).

Call the authenticated admin API:

curl -sS -X POST https://authority.example.com/internal/signing/rotate \
  -H "x-stellaops-bootstrap-key: ${BOOTSTRAP_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
        "keyId": "authority-signing-2025",
        "location": "../certificates/authority-signing-2025.pem",
        "source": "file"
      }'

Verify the response reports the previous key as retired and fetch /jwks to confirm the new kid appears with status: "active".
Persist the old key path in signing.additionalKeys (the rotation API updates in-memory options; rewrite the YAML to match so restarts remain consistent).
If you prefer automation, trigger the .gitea/workflows/authority-key-rotation.yml workflow with the new keyId/keyPath; it wraps ops/authority/key-rotation.sh and reads environment-specific secrets. The older key will be marked retired and appended to signing.additionalKeys.
Re-run stella auth revoke export so revocation bundles are signed with the new key. Downstream caches should refresh JWKS within their configured lifetime (StellaOpsAuthorityOptions.Signing + client cache tolerance).

The rotation API leverages the same cryptography abstractions as revocation signing; no restart is required and the previous key is marked retired but kept available for verification.
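
The JWKS check from the SOP can be automated along these lines. This is a Python sketch that assumes each /jwks entry carries the standard `kid` member plus a `status` member holding active or retired, as the text describes. The helper names are hypothetical, and fetching the document is left to the caller.

```python
import json


def key_statuses(jwks_json):
    """Map each kid in a JWKS document to its status metadata."""
    doc = json.loads(jwks_json)
    return {key["kid"]: key.get("status") for key in doc.get("keys", [])}


def rotation_applied(jwks_json, new_kid, old_kid):
    """True when the new key is active and the previous key is retained as retired."""
    statuses = key_statuses(jwks_json)
    return statuses.get(new_kid) == "active" and statuses.get(old_kid) == "retired"
```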

6. Bootstrap & Administrative Endpoints

Administrative APIs live under /internal/* and require the bootstrap API key plus rate-limiter compliance.

| Endpoint | Method | Description |
| --- | --- | --- |
| `/internal/users` | `POST` | Provision initial administrative accounts through the registered password-capable plug-in. Emits structured audit events. |
| `/internal/clients` | `POST` | Provision OAuth clients (client credentials / device code). |
| `/internal/revocations/export` | `GET` | Export revocation bundle + detached JWS + digest. |
| `/internal/signing/rotate` | `POST` | Promote a new signing key (see SOP above). Request body accepts `keyId`, `location`, optional `source`, `algorithm`, `provider`, and metadata. |

All administrative calls emit AuthEventRecord entries enriched with correlation IDs, PII tags, and network metadata for offline SOC ingestion.

Tenant hint: include a tenant entry inside properties when bootstrapping clients. Authority normalises the value, stores it on the registration, and stamps future tokens/audit events with the tenant.

Bootstrap client example

POST /internal/clients
{
  "clientId": "concelier",
  "confidential": true,
  "displayName": "Concelier Backend",
  "allowedGrantTypes": ["client_credentials"],
  "allowedScopes": ["concelier.jobs.trigger", "advisory:ingest", "advisory:read"],
  "properties": {
    "tenant": "tenant-default"
  }
}

For environments with multiple tenants, repeat the call per tenant-specific client (e.g. concelier-tenant-a, concelier-tenant-b) or append suffixes to the client identifier.
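
Generating the per-tenant request bodies might look like this Python sketch. The `tenant_client_payload` helper is hypothetical and the display name format is invented for illustration; the field names come from the bootstrap example above. The sketch assumes tenant identifiers such as tenant-a, so the suffixed client ID comes out as concelier-tenant-a.

```python
import json


def tenant_client_payload(base_id, tenant, scopes):
    """Build a POST /internal/clients body for one tenant-specific client."""
    tenant = tenant.strip().lower()  # mirror Authority's tenant normalisation
    return {
        "clientId": f"{base_id}-{tenant}",
        "confidential": True,
        "displayName": f"{base_id} ({tenant})",  # illustrative format only
        "allowedGrantTypes": ["client_credentials"],
        "allowedScopes": list(scopes),
        "properties": {"tenant": tenant},
    }


if __name__ == "__main__":
    for tenant in ("tenant-a", "tenant-b"):
        body = tenant_client_payload(
            "concelier", tenant, ["advisory:ingest", "advisory:read"]
        )
        print(json.dumps(body, indent=2))
```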

Aggregation-only verification tokens

Issue a dedicated client (e.g. aoc-verifier) with the scopes aoc:verify, advisory:read, and vex:read for each tenant that runs guard checks. Authority refuses to mint tokens for these scopes unless the client registration provides a tenant hint.
The CLI (stella aoc verify --tenant <tenant>) and Console verification panel both call /aoc/verify on Concelier and Excititor. Tokens that omit the tenant claim or present a tenant that does not match the stored registration are rejected with invalid_client/invalid_token.
Audit: authority.client_credentials.grant entries record scope.invalid="aoc:verify" when requests are rejected because the tenant hint is missing or mismatched.

Exception approvals & routing

New scopes exceptions:read, exceptions:write, and exceptions:approve govern access to the exception lifecycle. Map these via tenant roles (exceptions-service, exceptions-approver) as described in /docs/security/authority-scopes.md.
Configure approval routing in authority.yaml with declarative templates. Each template exposes an authorityRouteId for downstream services (Policy Engine, Console) and an optional requireMfa flag:

exceptions:
  routingTemplates:
    - id: "secops"
      authorityRouteId: "approvals/secops"
      requireMfa: true
      description: "Security Operations approval chain"
    - id: "governance"
      authorityRouteId: "approvals/governance"
      requireMfa: false
      description: "Non-production waiver review"

Clients requesting exception scopes must include a tenant assignment. Authority rejects client-credential flows that request exceptions:* with invalid_client and logs scope.invalid="exceptions:write" (or the requested scope) in authority.client_credentials.grant audit events when the tenant hint is missing.
When any configured routing template sets requireMfa: true, user-facing tokens that contain exceptions:approve must be acquired through an MFA-capable identity provider. Password/OIDC flows that lack MFA support are rejected with authority.password.grant audit events where reason="Exception approval scope requires an MFA-capable identity provider."
Update interactive clients (Console) to request exceptions:read by default and elevate to exceptions:approve only inside fresh-auth workflows for approvers. Documented examples live in etc/authority.yaml.sample.
Aggregation-only verification responses map guard failures to ERR_AOC_00x codes, and Authority emits authority.client_credentials.grant + authority.token.validate_access audit records containing the tenant and scopes so operators can trace who executed a run.
For air-gapped or offline replicas, pre-issue aggregation-only verification tokens per tenant and rotate them alongside ingest credentials; the guard endpoints never mutate data and remain safe to expose through the offline kit schedule.

7. Configuration Reference

| Section | Key | Description | Notes |
| --- | --- | --- | --- |
| Root | `issuer` | Absolute HTTPS issuer advertised to clients. | Required. Loopback HTTP allowed only for development. |
| Tokens | `accessTokenLifetime`, `refreshTokenLifetime`, etc. | Lifetimes for each grant (access, refresh, device, authorization code, identity). | Enforced during issuance; persisted on each token document. |
| Storage | `storage.connectionString` | MongoDB connection string. | Required even for tests; offline kits ship snapshots for seeding. |
| Signing | `signing.enabled` | Enable JWKS/revocation signing. | Disable only for development. |
| Signing | `signing.algorithm` | Signing algorithm identifier. | Currently ES256; additional curves can be wired through crypto providers. |
| Signing | `signing.keySource` | Loader identifier (`file`, `vault`, custom). | Determines which `IAuthoritySigningKeySource` resolves keys. |
| Signing | `signing.keyPath` | Relative/absolute path understood by the loader. | Stored as-is; rotation request should keep it in sync with filesystem layout. |
| Signing | `signing.activeKeyId` | Active JWKS / revocation signing key id. | Exposed as `kid` in JWKS and bundles. |
| Signing | `signing.additionalKeys[].keyId` | Retired key identifier retained for verification. | Manager updates this automatically after rotation; keep YAML aligned. |
| Signing | `signing.additionalKeys[].source` | Loader identifier per retired key. | Defaults to `signing.keySource` if omitted. |
| Security | `security.rateLimiting` | Fixed-window limits for `/token`, `/authorize`, `/internal/*`. | See `docs/security/rate-limits.md` for tuning. |
| Bootstrap | `bootstrap.apiKey` | Shared secret required for `/internal/*`. | Only required when `bootstrap.enabled` is true. |

7.1 Sender-constrained clients (DPoP & mTLS)

Authority now understands two flavours of sender-constrained OAuth clients:

DPoP proof-of-possession – clients sign a DPoP header for /token requests. Authority validates the JWK thumbprint, HTTP method/URI, and replay window, then stamps the resulting access token with cnf.jkt so downstream services can verify the same key is reused.
- Configure under security.senderConstraints.dpop. allowedAlgorithms, proofLifetime, and replayWindow are enforced at validation time.
- security.senderConstraints.dpop.nonce.enabled enables nonce challenges for high-value audiences (requiredAudiences, normalised to case-insensitive strings). When a nonce is required but missing or expired, /token replies with WWW-Authenticate: DPoP error="use_dpop_nonce" (and, when available, a fresh DPoP-Nonce header). Clients must retry with the issued nonce embedded in the proof.
- security.senderConstraints.dpop.nonce.store selects memory (default) or redis. When redis is configured, set security.senderConstraints.dpop.nonce.redisConnectionString so replicas share nonce issuance and high-value clients avoid replay gaps during failover.
- Example (enabling Redis-backed nonces; adjust audiences per deployment):
```
security:
  senderConstraints:
    dpop:
      enabled: true
      proofLifetime: "00:02:00"
      replayWindow: "00:05:00"
      allowedAlgorithms: [ "ES256", "ES384" ]
      nonce:
        enabled: true
        ttl: "00:10:00"
        maxIssuancePerMinute: 120
        store: "redis"
        redisConnectionString: "redis://authority-redis:6379?ssl=false"
        requiredAudiences:
          - "signer"
          - "attestor"
```
  Operators can override any field via environment variables (e.g. STELLAOPS_AUTHORITY__SECURITY__SENDERCONSTRAINTS__DPOP__NONCE__STORE=redis).
- Declare client audiences in bootstrap manifests or plug-in provisioning metadata; Authority now defaults the token aud claim and resource indicator from this list, which is also used to trigger nonce enforcement for audiences such as signer and attestor.
Mutual TLS clients – client registrations may declare an mTLS binding (senderConstraint: mtls). When enabled via security.senderConstraints.mtls, Authority validates the presented client certificate against stored bindings (certificateBindings[]), optional chain verification, and timing windows. Successful requests embed cnf.x5t#S256 into the access token (and introspection output) so resource servers can enforce the certificate thumbprint.
- security.senderConstraints.mtls.enforceForAudiences forces mTLS whenever the requested aud/resource (or the client's configured audiences) intersect the configured allow-list (default includes signer). Clients configured for different sender constraints are rejected early so operator policy remains consistent.
- Certificate bindings now act as an allow-list: Authority verifies thumbprint, subject, issuer, serial number, and any declared SAN values against the presented certificate, with rotation grace windows applied to notBefore/notAfter. Operators can enforce subject regexes, SAN type allow-lists (dns, uri, ip), trusted certificate authorities, and rotation grace via security.senderConstraints.mtls.*.

Both modes persist additional metadata in authority_tokens: senderConstraint records the enforced policy, while senderKeyThumbprint stores the DPoP JWK thumbprint or mTLS certificate hash captured at issuance. Downstream services can rely on these fields (and the corresponding cnf claim) when auditing offline copies of the token store.
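
For DPoP, the value stamped into cnf.jkt is the SHA-256 JWK thumbprint defined by RFC 7638. The Python sketch below computes it for an EC public key and compares it with a token's cnf claim. The helper names are hypothetical, and it covers EC keys only, which matches the ES256/ES384 algorithms in the sample configuration.

```python
import base64
import hashlib
import json


def ec_jwk_thumbprint(jwk):
    """RFC 7638 SHA-256 thumbprint of an EC public JWK, base64url without padding."""
    # Only the required EC members are hashed, in lexicographic order, no whitespace.
    required = {name: jwk[name] for name in ("crv", "kty", "x", "y")}
    canonical = json.dumps(required, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def matches_cnf(access_token_claims, proof_jwk):
    """Check that the DPoP proof key is the key the access token was bound to."""
    bound = access_token_claims.get("cnf", {}).get("jkt")
    return bound is not None and bound == ec_jwk_thumbprint(proof_jwk)
```

Optional members such as kid or use do not affect the thumbprint, so a resource server can compare keys regardless of how the client serialises its JWK.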

7.2 Policy Engine clients & scopes

Policy Engine v2 introduces dedicated scopes and a service identity that materialises effective findings. Configure Authority as follows when provisioning policy clients:

| Client | Scopes | Notes |
| --- | --- | --- |
| `policy-engine` (service) | `policy:run`, `findings:read`, `effective:write` | Must include `properties.serviceIdentity: policy-engine` and a tenant. Authority rejects `effective:write` tokens without the marker or tenant. |
| `policy-cli` / automation | `policy:read`, `policy:author`, `policy:review`, `policy:simulate`, `findings:read` (optionally add `policy:approve` / `policy:operate` / `policy:activate` for promotion pipelines) | Keep scopes minimal; reroll CLI/CI tokens issued before 2025‑10‑27 so they drop legacy scope names and adopt the new set. |
| UI/editor sessions | `policy:read`, `policy:author`, `policy:simulate` (+ reviewer/approver/operator scopes as appropriate) | Issue tenant-specific clients so audit and rate limits remain scoped. |

Sample YAML entry:

  - clientId: "policy-engine"
    displayName: "Policy Engine Service"
    grantTypes: [ "client_credentials" ]
    audiences: [ "api://policy-engine" ]
    scopes: [ "policy:run", "findings:read", "effective:write" ]
    tenant: "tenant-default"
    properties:
      serviceIdentity: "policy-engine"
    senderConstraint: "dpop"
    auth:
      type: "client_secret"
      secretFile: "../secrets/policy-engine.secret"

Compliance checklist:

policy-engine client includes properties.serviceIdentity: policy-engine and a tenant hint; logins missing either are rejected.
Non-service clients omit effective:write and receive only the scopes required for their role (policy:read, policy:author, policy:review, policy:approve, policy:operate, policy:simulate, etc.).
Legacy tokens using policy:write/policy:submit/policy:edit are rotated to the new scope set before Production change freeze (see release migration note below).
Approval/activation workflows use identities distinct from authoring identities; tenants are provisioned per client to keep telemetry segregated.
Operators document reviewer assignments and incident procedures alongside /docs/security/policy-governance.md and archive policy evidence bundles (stella policy bundle export) with each release.
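
Parts of this checklist can be linted before deployment. The Python sketch below inspects a client entry shaped like the sample YAML above, after it has been parsed into a dict. `check_policy_client` and its message strings are hypothetical and cover only the tenant, serviceIdentity, and legacy-scope items.

```python
LEGACY_SCOPES = {"policy:write", "policy:submit", "policy:edit"}


def check_policy_client(client):
    """Return a list of checklist violations for one parsed client entry."""
    problems = []
    scopes = set(client.get("scopes", []))
    identity = client.get("properties", {}).get("serviceIdentity")
    if not client.get("tenant"):
        problems.append("missing tenant")
    if "effective:write" in scopes and identity != "policy-engine":
        problems.append("effective:write requires serviceIdentity policy-engine")
    for scope in sorted(scopes & LEGACY_SCOPES):
        problems.append(f"legacy scope {scope}")
    return problems
```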

7.3 Orchestrator roles & scopes

| Role / Client | Scopes | Notes |
| --- | --- | --- |
| `Orch.Viewer` role | `orch:read` | Read-only access to Orchestrator dashboards, queues, and telemetry. |
| `Orch.Operator` role | `orch:read`, `orch:operate` | Issue short-lived tokens for control actions (pause/resume, retry, sync). Token requests must include `operator_reason` (≤256 chars) and `operator_ticket` (≤128 chars); Authority rejects requests missing either value and records both in audit events. |

Token request example via client credentials:

curl -u orch-operator:s3cr3t! \
  -d 'grant_type=client_credentials' \
  -d 'scope=orch:operate' \
  -d 'operator_reason=resume source after maintenance' \
  -d 'operator_ticket=INC-2045' \
  https://authority.example.com/token

Tokens lacking operator_reason or operator_ticket receive invalid_request; audit events (authority.client_credentials.grant) surface the supplied values under request.reason and request.ticket for downstream review. CLI clients set these parameters via Authority.OperatorReason / Authority.OperatorTicket (environment variables STELLAOPS_ORCH_REASON and STELLAOPS_ORCH_TICKET).
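
A client can apply the same limits locally before calling /token. The Python sketch below builds the form body with the standard library. `operator_token_form` is a hypothetical helper, and the length checks count characters as Python does, which may differ from how Authority measures them.

```python
from urllib.parse import urlencode


def operator_token_form(reason, ticket, scope="orch:operate"):
    """Build the x-www-form-urlencoded body for an operator token request."""
    if not reason or len(reason) > 256:
        raise ValueError("operator_reason is required (max 256 chars)")
    if not ticket or len(ticket) > 128:
        raise ValueError("operator_ticket is required (max 128 chars)")
    return urlencode({
        "grant_type": "client_credentials",
        "scope": scope,
        "operator_reason": reason,
        "operator_ticket": ticket,
    })
```

Failing early on the client side avoids an invalid_request round trip, although Authority remains the component that enforces the rule.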

8. Offline & Sovereign Operation

No outbound dependencies: Authority only contacts MongoDB and local plugins. Discovery and JWKS are cached by clients with offline tolerances (AllowOfflineCacheFallback, OfflineCacheTolerance). Operators should mirror these responses for air-gapped use.
Structured logging: Every revocation export, signing rotation, bootstrap action, and token issuance emits structured logs with traceId, client_id, subjectId, and network.remoteIp where applicable. Mirror logs to your SIEM to retain audit trails without central connectivity.
Determinism: Sorting rules in token and revocation exports guarantee byte-for-byte identical artefacts given the same datastore state. Hashes and signatures remain stable across machines.

9. Operational Checklist

Protect the bootstrap API key and disable bootstrap endpoints (bootstrap.enabled: false) once initial setup is complete.
Schedule stella auth revoke export (or /internal/revocations/export) at the same cadence as Concelier exports so bundles remain in lockstep.
Rotate signing keys before expiration; keep at least one retired key until all cached bundles/tokens signed with it have expired.
Monitor /health and /ready plus rate-limiter metrics to detect plugin outages early.
Ensure downstream services cache JWKS and revocation bundles within tolerances; stale caches risk accepting revoked tokens.

For plug-in specific requirements, refer to the Authority Plug-in Developer Guide. For the revocation bundle validation workflow, see the Authority Revocation Bundle document.
