StellaOps Authority Service
Status: Drafted 2025-10-12 (CORE5B.DOC / DOC1.AUTH) – aligns with Authority revocation store, JWKS rotation, and bootstrap endpoints delivered in Sprint 1.
1. Purpose
The StellaOps Authority service issues OAuth2/OIDC tokens for every StellaOps module (Concelier, Backend, Agent, Zastava) and exposes the policy controls required in sovereign/offline environments. Authority is built as a minimal ASP.NET host that:
- brokers password, client-credentials, and device-code flows through pluggable identity providers;
- persists access/refresh/device tokens in MongoDB with deterministic schemas for replay analysis and air-gapped audit copies;
- distributes revocation bundles and JWKS material so downstream services can enforce lockouts without direct database access;
- offers bootstrap APIs for first-run provisioning and key rotation without redeploying binaries.
Authority is deployed alongside Concelier in air-gapped environments and never requires outbound internet access. All trusted metadata (OpenIddict discovery, JWKS, revocation bundles) is cacheable, signed, and reproducible.
2. Component Architecture
Authority is composed of five cooperating subsystems:
- Minimal API host – configures OpenIddict endpoints (/token, /authorize, /revoke, /jwks) and structured logging/telemetry. Rate limiting hooks (AuthorityRateLimiter) wrap every request.
- Plugin host – loads StellaOps.Authority.Plugin.*.dll assemblies, applies capability metadata, and exposes password/client provisioning surfaces through dependency injection.
- Mongo storage – persists tokens, revocations, bootstrap invites, and plugin state in deterministic collections indexed for offline sync (authority_tokens, authority_revocations, etc.).
- Cryptography layer – StellaOps.Cryptography abstractions manage password hashing, signing keys, JWKS export, and detached JWS generation.
- Offline ops APIs – internal endpoints under /internal/* provide administrative flows (bootstrap users/clients, revocation export) guarded by API keys and deterministic audit events.
A high-level sequence for password logins:
Client -> /token (password grant)
-> Rate limiter & audit hooks
-> Plugin credential store (Argon2id verification)
-> Token persistence (Mongo authority_tokens)
-> Response (access/refresh tokens + deterministic claims)
3. Token Lifecycle & Persistence
Authority persists every issued token in MongoDB so operators can audit or revoke without scanning distributed caches.
- Collection: authority_tokens
- Key fields:
  - tokenId, type (access_token, refresh_token, device_code, authorization_code)
  - subjectId, clientId, ordered scope array
  - tenant (lower-cased tenant hint from the issuing client, omitted for global clients)
  - status (valid, revoked, expired), createdAt, optional expiresAt
  - revokedAt, machine-readable revokedReason, optional revokedReasonDescription
  - revokedMetadata (string dictionary for plugin-specific context)
- Persistence flow: PersistTokensHandler stamps missing JWT IDs, normalises scopes, and stores every principal emitted by OpenIddict.
- Revocation flow: AuthorityTokenStore.UpdateStatusAsync flips status, records the reason metadata, and is invoked by token revocation handlers and plugin provisioning events (e.g., disabling a user).
- Expiry maintenance: AuthorityTokenStore.DeleteExpiredAsync prunes non-revoked tokens past their expiresAt timestamp. Operators should schedule this in maintenance windows if large volumes of tokens are issued.
Expectations for resource servers
Resource servers (Concelier WebService, Backend, Agent) must not assume in-memory caches are authoritative. They should:
- cache /jwks and /revocations/export responses within configured lifetimes;
- honour revokedReason metadata when shaping audit trails;
- treat status != "valid" or missing tokens as immediate denial conditions.
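The denial rule above can be sketched as a small decision function. This is a minimal illustration, assuming the resource server has already looked the token up in a mirrored copy of authority_tokens; the function name and the "unknown_token" label are hypothetical and not part of Authority.

```python
from typing import Any, Dict, Mapping, Optional


def evaluate_token(token_doc: Optional[Mapping[str, Any]]) -> Dict[str, Any]:
    """Return an allow/deny decision for a token document (None when not found)."""
    if token_doc is None:
        # Missing tokens are denied outright; "unknown_token" is a hypothetical label.
        return {"allowed": False, "reason": "unknown_token"}

    status = token_doc.get("status")
    if status != "valid":
        # Prefer the machine-readable revokedReason for the audit trail,
        # falling back to the raw status (for example "expired").
        return {"allowed": False, "reason": token_doc.get("revokedReason") or status}

    return {"allowed": True, "reason": None}
```

A caller would log the returned reason alongside its own audit event and reject the request whenever allowed is false.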
Tenant propagation
- Client provisioning (bootstrap or plug-in) accepts a tenant hint. Authority normalises the value (trim().ToLowerInvariant()) and persists it alongside the registration. Clients without an explicit tenant remain global.
- Issued principals include the stellaops:tenant claim. PersistTokensHandler mirrors this claim into authority_tokens.tenant, enabling per-tenant revocation and reporting.
- Rate limiter metadata now tags requests with authority.tenant, unlocking per-tenant throughput metrics and diagnostic filters. Audit events (authority.client_credentials.grant, authority.password.grant, bootstrap flows) surface the tenant, and login attempt documents are indexed on {tenant, occurredAt} for quick queries.
- Password grant flows reuse the client registration's tenant and enforce the configured scope allow-list. Requested scopes outside that list (or mismatched tenants) trigger invalid_scope/invalid_client failures, ensuring cross-tenant access is denied before token issuance.
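A rough Python rendering of the normalisation and allow-list checks described above may help when writing tests for downstream services. It is a sketch only: the client dictionary shape, the function names, and the mapping of tenant mismatches to invalid_client and scope violations to invalid_scope are assumptions. Python's lower() can also differ from .NET's ToLowerInvariant() for some non-ASCII input.

```python
from typing import Iterable, Optional


def normalise_tenant(hint: Optional[str]) -> Optional[str]:
    """Trim and lower-case a tenant hint; blank or missing hints mean a global client."""
    if hint is None:
        return None
    value = hint.strip().lower()
    return value or None


def check_grant(client: dict, requested_scopes: Iterable[str],
                requested_tenant: Optional[str] = None) -> Optional[str]:
    """Return an OAuth error code, or None when the request may proceed."""
    client_tenant = normalise_tenant(client.get("tenant"))
    wanted_tenant = normalise_tenant(requested_tenant)

    # Assumption: a tenant mismatch is reported as invalid_client.
    if wanted_tenant is not None and wanted_tenant != client_tenant:
        return "invalid_client"

    # Assumption: scopes outside the allow-list are reported as invalid_scope.
    allowed = set(client.get("allowedScopes", []))
    if any(scope not in allowed for scope in requested_scopes):
        return "invalid_scope"

    return None
```

For example, a registration stored with tenant "Tenant-A" matches a request for "tenant-a", while a request for "tenant-b" is refused before any token is issued.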
Default service scopes
| Client ID | Purpose | Scopes granted | Sender constraint | Tenant |
|---|---|---|---|---|
| concelier-ingest | Concelier raw advisory ingestion | advisory:ingest, advisory:read | dpop | tenant-default |
| excitor-ingest | Excititor raw VEX ingestion | vex:ingest, vex:read | dpop | tenant-default |
| aoc-verifier | Aggregation-only contract verification | aoc:verify | dpop | tenant-default |
| cartographer-service | Graph snapshot construction | graph:write, graph:read | dpop | tenant-default |
| graph-api | Graph Explorer gateway/API | graph:read, graph:export, graph:simulate | dpop | tenant-default |
| vuln-explorer-ui | Vuln Explorer UI/API | vuln:read | dpop | tenant-default |
Secret hygiene (2025-10-27): The repository includes a convenience etc/authority.yaml for compose/helm smoke tests. Every entry’s secretFile points to etc/secrets/*.secret, which ship with *-change-me placeholders; replace them with strong values (and wire them through your vault/secret manager) before issuing tokens in CI, staging, or production.
These registrations are provided as examples in etc/authority.yaml.sample. Clone them per tenant (for example concelier-tenant-a, concelier-tenant-b) so tokens remain tenant-scoped by construction.
Graph Explorer introduces dedicated scopes: graph:write for Cartographer build jobs, graph:read for query/read operations, graph:export for long-running export downloads, and graph:simulate for what-if overlays. Assign only the scopes a client actually needs to preserve least privilege—UI-facing clients should typically request read/export access, while background services (Cartographer, Scheduler) require write privileges.
Least-privilege guidance for graph clients
- Service identities – The Cartographer worker should request graph:write and graph:read only; grant graph:simulate exclusively to pipeline automation that invokes Policy Engine overlays on demand. Keep graph:export scoped to API gateway components responsible for streaming GraphML/JSONL artifacts. Authority enforces this by rejecting graph:write tokens that lack properties.serviceIdentity: cartographer.
- Tenant propagation – Every client registration must pin a tenant hint. Authority normalises the value and stamps it into issued tokens (stellaops:tenant) so downstream services (Scheduler, Graph API, Console) can enforce tenant isolation without custom headers. Graph scopes (graph:read, graph:write, graph:export, graph:simulate) are denied if the tenant hint is missing.
- SDK alignment – Use the generated StellaOpsScopes constants in service code to request graph scopes. Hard-coded strings risk falling out of sync as additional graph capabilities are added.
- DPoP for automation – Maintain sender-constrained (dpop) flows for Cartographer and Scheduler to limit reuse of access tokens if a build host is compromised. For UI-facing tokens, pair graph:read/graph:export with short lifetimes and enforce refresh-token rotation at the gateway.
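The two enforcement rules stated above (graph scopes need a tenant, and graph:write needs the cartographer service identity) can be expressed as a pre-flight check for client manifests. The sketch below is illustrative; the function name and the violation messages are hypothetical, and Authority's own checks may cover more cases.

```python
from typing import Iterable, List, Mapping, Optional

GRAPH_SCOPES = frozenset({"graph:read", "graph:write", "graph:export", "graph:simulate"})


def graph_scope_violations(requested_scopes: Iterable[str], tenant: Optional[str],
                           properties: Mapping[str, str]) -> List[str]:
    """List the documented graph-scope rules that a client registration would break."""
    requested = set(requested_scopes)
    violations: List[str] = []

    # Any graph scope is denied when the tenant hint is missing or blank.
    if requested & GRAPH_SCOPES and not (tenant or "").strip():
        violations.append("graph scopes require a tenant hint")

    # graph:write is reserved for the Cartographer service identity.
    if "graph:write" in requested and properties.get("serviceIdentity") != "cartographer":
        violations.append("graph:write requires serviceIdentity=cartographer")

    return violations
```

Running this against each entry of a client manifest before provisioning gives an early signal that a registration would be rejected at token time.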
Vuln Explorer permalinks
- Scope – vuln:read authorises Vuln Explorer to fetch advisory/linkset evidence and issue shareable links. Assign it only to front-end/API clients that must render vulnerability details.
- Signed links – POST /permalinks/vuln (requires vuln:read) accepts { "tenant": "tenant-a", "resourceKind": "vulnerability", "state": { ... }, "expiresInSeconds": 86400 } and returns a JWT (token) plus issuedAt/expiresAt. The token embeds the tenant, requested state, and vuln:read scope and is signed with the same Authority signing keys published via /jwks.
- Validation – Resource servers verify the permalink using cached JWKS: check signature, ensure the tenant matches the current request context, honour the expiry, and enforce the contained vuln:read scope. The payload’s resource.state block is opaque JSON so UIs can round-trip filters/search terms without new schema changes.
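The claim checks in the validation step can be sketched as follows. This assumes the JWT signature has already been verified against cached JWKS by a JOSE library and that the decoded claims are available as a dictionary. The claim names used here (stellaops:tenant, exp, and a space-delimited scope) and the rejection labels are assumptions for illustration; the source does not spell out the permalink claim layout.

```python
import time
from typing import Any, Mapping, Optional


def permalink_rejection(claims: Mapping[str, Any], request_tenant: str,
                        now: Optional[float] = None) -> Optional[str]:
    """Return a rejection label for already signature-verified claims, or None if acceptable."""
    current = time.time() if now is None else now

    # The tenant embedded in the link must match the current request context.
    if claims.get("stellaops:tenant") != request_tenant:
        return "tenant_mismatch"

    # Honour the expiry; a missing or non-numeric exp is treated as expired.
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or current >= exp:
        return "expired"

    # Enforce the contained vuln:read scope (assumed space-delimited).
    if "vuln:read" not in str(claims.get("scope", "")).split():
        return "missing_scope"

    return None
```

The state block is deliberately not inspected here, in keeping with its role as opaque JSON.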
4. Revocation Pipeline
Authority centralises revocation in authority_revocations with deterministic categories:
| Category | Meaning | Required fields |
|---|---|---|
| token | Specific OAuth token revoked early. | revocationId (token id), tokenType, optional clientId, subjectId |
| subject | All tokens for a subject disabled. | revocationId (= subject id) |
| client | OAuth client registration revoked. | revocationId (= client id) |
| key | Signing/JWE key withdrawn. | revocationId (= key id) |
RevocationBundleBuilder flattens Mongo documents into canonical JSON, sorts entries by (category, revocationId, revokedAt), and signs exports using detached JWS (RFC 7797) with cosign-compatible headers.
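To make the determinism property concrete, the sketch below sorts entries by (category, revocationId, revokedAt) and serialises them to a stable byte sequence before hashing. It is not the RevocationBundleBuilder implementation: the "revocations" envelope key, the use of sorted keys with compact separators as the canonical form, and the assumption that revokedAt values share one ISO-8601 UTC format (so they sort lexicographically) are all illustrative choices. Signing is out of scope here.

```python
import hashlib
import json
from typing import Dict, Iterable


def canonical_bundle(entries: Iterable[Dict[str, object]]) -> bytes:
    """Serialise revocation entries to stable bytes, independent of input order."""
    ordered = sorted(
        entries,
        key=lambda e: (e["category"], e["revocationId"], e["revokedAt"]),
    )
    document = {"revocations": ordered}  # hypothetical envelope, for illustration only
    text = json.dumps(document, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def sha256_hex(payload: bytes) -> str:
    """Hex digest of the kind a .sha256 sidecar file would carry."""
    return hashlib.sha256(payload).hexdigest()
```

Because both the entry order and the key order are fixed, two hosts exporting from the same datastore state produce identical bytes and therefore identical digests.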
Export surfaces (deterministic output, suitable for Offline Kit):
- CLI: stella auth revoke export --output ./out writes revocation-bundle.json, .jws, .sha256.
- Verification: stella auth revoke verify --bundle <path> --signature <path> --key <path> validates detached JWS signatures before distribution, selecting the crypto provider advertised in the detached header (see docs/security/revocation-bundle.md).
- API: GET /internal/revocations/export (requires bootstrap API key) returns the same payload.
- Verification: stella auth revoke verify validates schema, digest, and detached JWS using cached JWKS or offline keys, automatically preferring the hinted provider (libsodium builds honour provider=libsodium; other builds fall back to the managed provider).
Consumer guidance:
- Mirror revocation-bundle.json* alongside Concelier exports. Offline agents fetch both over the existing update channel.
- Use bundle sequence and bundleId to detect replay or monotonicity regressions. Ignore bundles with older sequence numbers unless bundleId changes and issuedAt advances.
- Treat revokedReason taxonomy as machine-friendly codes (compromised, rotation, policy, lifecycle). Translating to human-readable logs is the consumer’s responsibility.
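The replay rule in the consumer guidance can be sketched as a comparison between the bundle a consumer currently holds and a newly fetched candidate. The sketch assumes sequence is an integer and issuedAt is an ISO-8601 timestamp with an explicit UTC offset or "Z" suffix. Treating an equal sequence the same as an older one is also an assumption, since the guidance only mentions older numbers.

```python
from datetime import datetime
from typing import Any, Mapping, Optional


def _parse(timestamp: str) -> datetime:
    """Parse an ISO-8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def should_apply(current: Optional[Mapping[str, Any]], candidate: Mapping[str, Any]) -> bool:
    """Decide whether a fetched revocation bundle should replace the held one."""
    if current is None:
        return True  # nothing held yet

    if candidate["sequence"] > current["sequence"]:
        return True  # normal monotonic advance

    # Same or older sequence: accept only if the bundle was re-issued
    # (bundleId changed) and its issuedAt moved forward.
    reissued = candidate["bundleId"] != current["bundleId"]
    newer = _parse(candidate["issuedAt"]) > _parse(current["issuedAt"])
    return reissued and newer
```

A consumer that rejects a candidate would keep enforcing the bundle it already holds and raise an alert, since a regression may indicate a replayed or stale mirror.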
5. Signing Keys & JWKS Rotation
Authority signs revocation bundles and publishes JWKS entries via the new signing manager:
- Configuration (authority.yaml):

      signing:
        enabled: true
        algorithm: ES256        # Defaults to ES256
        keySource: file         # Loader identifier (file, vault, etc.)
        provider: default       # Optional preferred crypto provider
        activeKeyId: authority-signing-dev
        keyPath: "../certificates/authority-signing-dev.pem"
        additionalKeys:
          - keyId: authority-signing-dev-2024
            path: "../certificates/authority-signing-dev-2024.pem"
            source: "file"

- Sources: The default loader supports PEM files relative to the content root; additional loaders can be registered via IAuthoritySigningKeySource.
- Providers: Keys are registered against the ICryptoProviderRegistry, so alternative implementations (HSM, libsodium) can be plugged in without changing host code.
- JWKS output: GET /jwks lists every signing key with status metadata (active, retired). Old keys remain until operators remove them from configuration, allowing verification of historical bundles/tokens.
Rotation SOP (no downtime)
- Generate a new P-256 private key (PEM) on an offline workstation and place it where the Authority host can read it (e.g., ../certificates/authority-signing-2025.pem).
- Call the authenticated admin API:

      curl -sS -X POST https://authority.example.com/internal/signing/rotate \
        -H "x-stellaops-bootstrap-key: ${BOOTSTRAP_KEY}" \
        -H "Content-Type: application/json" \
        -d '{
              "keyId": "authority-signing-2025",
              "location": "../certificates/authority-signing-2025.pem",
              "source": "file"
            }'

- Verify the response reports the previous key as retired and fetch /jwks to confirm the new kid appears with status: "active".
- Persist the old key path in signing.additionalKeys (the rotation API updates in-memory options; rewrite the YAML to match so restarts remain consistent).
- If you prefer automation, trigger the .gitea/workflows/authority-key-rotation.yml workflow with the new keyId/keyPath; it wraps ops/authority/key-rotation.sh and reads environment-specific secrets. The older key will be marked retired and appended to signing.additionalKeys.
- Re-run stella auth revoke export so revocation bundles are signed with the new key. Downstream caches should refresh JWKS within their configured lifetime (StellaOpsAuthorityOptions.Signing + client cache tolerance).
The rotation API leverages the same cryptography abstractions as revocation signing; no restart is required and the previous key is marked retired but kept available for verification.
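The post-rotation check of /jwks can be automated with a few lines operating on the parsed JSON response. The sketch assumes the standard JWKS shape (a top-level keys array whose entries carry kid) plus the per-key status field described above; the function name is hypothetical, and fetching the document is left to the caller.

```python
from typing import Any, Mapping


def rotation_looks_complete(jwks: Mapping[str, Any], new_kid: str, old_kid: str) -> bool:
    """Check that the new key is active and the previous key is still published as retired."""
    by_kid = {key.get("kid"): key for key in jwks.get("keys", [])}
    new_key = by_kid.get(new_kid)
    old_key = by_kid.get(old_kid)
    return (
        new_key is not None and new_key.get("status") == "active"
        and old_key is not None and old_key.get("status") == "retired"
    )
```

Requiring the retired key to remain present guards against removing it from configuration before bundles and tokens signed with it have expired.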
6. Bootstrap & Administrative Endpoints
Administrative APIs live under /internal/* and require the bootstrap API key plus rate-limiter compliance.
| Endpoint | Method | Description |
|---|---|---|
| /internal/users | POST | Provision initial administrative accounts through the registered password-capable plug-in. Emits structured audit events. |
| /internal/clients | POST | Provision OAuth clients (client credentials / device code). |
| /internal/revocations/export | GET | Export revocation bundle + detached JWS + digest. |
| /internal/signing/rotate | POST | Promote a new signing key (see SOP above). Request body accepts keyId, location, optional source, algorithm, provider, and metadata. |
All administrative calls emit AuthEventRecord entries enriched with correlation IDs, PII tags, and network metadata for offline SOC ingestion.
Tenant hint: include a tenant entry inside properties when bootstrapping clients. Authority normalises the value, stores it on the registration, and stamps future tokens/audit events with the tenant.
Bootstrap client example
POST /internal/clients
{
  "clientId": "concelier",
  "confidential": true,
  "displayName": "Concelier Backend",
  "allowedGrantTypes": ["client_credentials"],
  "allowedScopes": ["concelier.jobs.trigger", "advisory:ingest", "advisory:read"],
  "properties": {
    "tenant": "tenant-default"
  }
}
For environments with multiple tenants, repeat the call per tenant-specific client (e.g. concelier-tenant-a, concelier-tenant-b) or append suffixes to the client identifier.
7. Configuration Reference
| Section | Key | Description | Notes |
|---|---|---|---|
| Root | issuer | Absolute HTTPS issuer advertised to clients. | Required. Loopback HTTP allowed only for development. |
| Tokens | accessTokenLifetime, refreshTokenLifetime, etc. | Lifetimes for each grant (access, refresh, device, authorization code, identity). | Enforced during issuance; persisted on each token document. |
| Storage | storage.connectionString | MongoDB connection string. | Required even for tests; offline kits ship snapshots for seeding. |
| Signing | signing.enabled | Enable JWKS/revocation signing. | Disable only for development. |
| Signing | signing.algorithm | Signing algorithm identifier. | Currently ES256; additional curves can be wired through crypto providers. |
| Signing | signing.keySource | Loader identifier (file, vault, custom). | Determines which IAuthoritySigningKeySource resolves keys. |
| Signing | signing.keyPath | Relative/absolute path understood by the loader. | Stored as-is; rotation request should keep it in sync with filesystem layout. |
| Signing | signing.activeKeyId | Active JWKS / revocation signing key id. | Exposed as kid in JWKS and bundles. |
| Signing | signing.additionalKeys[].keyId | Retired key identifier retained for verification. | Manager updates this automatically after rotation; keep YAML aligned. |
| Signing | signing.additionalKeys[].source | Loader identifier per retired key. | Defaults to signing.keySource if omitted. |
| Security | security.rateLimiting | Fixed-window limits for /token, /authorize, /internal/*. | See docs/security/rate-limits.md for tuning. |
| Bootstrap | bootstrap.apiKey | Shared secret required for /internal/*. | Only required when bootstrap.enabled is true. |
7.1 Sender-constrained clients (DPoP & mTLS)
Authority now understands two flavours of sender-constrained OAuth clients:
- DPoP proof-of-possession – clients sign a DPoP header for /token requests. Authority validates the JWK thumbprint, HTTP method/URI, and replay window, then stamps the resulting access token with cnf.jkt so downstream services can verify the same key is reused.
  - Configure under security.senderConstraints.dpop. allowedAlgorithms, proofLifetime, and replayWindow are enforced at validation time.
  - security.senderConstraints.dpop.nonce.enabled enables nonce challenges for high-value audiences (requiredAudiences, normalised to case-insensitive strings). When a nonce is required but missing or expired, /token replies with WWW-Authenticate: DPoP error="use_dpop_nonce" (and, when available, a fresh DPoP-Nonce header). Clients must retry with the issued nonce embedded in the proof.
  - security.senderConstraints.dpop.nonce.store selects memory (default) or redis. When redis is configured, set security.senderConstraints.dpop.nonce.redisConnectionString so replicas share nonce issuance and high-value clients avoid replay gaps during failover.
  - Example (enabling Redis-backed nonces; adjust audiences per deployment):

        security:
          senderConstraints:
            dpop:
              enabled: true
              proofLifetime: "00:02:00"
              replayWindow: "00:05:00"
              allowedAlgorithms: [ "ES256", "ES384" ]
              nonce:
                enabled: true
                ttl: "00:10:00"
                maxIssuancePerMinute: 120
                store: "redis"
                redisConnectionString: "redis://authority-redis:6379?ssl=false"
                requiredAudiences:
                  - "signer"
                  - "attestor"

    Operators can override any field via environment variables (e.g. STELLAOPS_AUTHORITY__SECURITY__SENDERCONSTRAINTS__DPOP__NONCE__STORE=redis).
  - Declare client audiences in bootstrap manifests or plug-in provisioning metadata; Authority now defaults the token aud claim and resource indicator from this list, which is also used to trigger nonce enforcement for audiences such as signer and attestor.
- Mutual TLS clients – client registrations may declare an mTLS binding (senderConstraint: mtls). When enabled via security.senderConstraints.mtls, Authority validates the presented client certificate against stored bindings (certificateBindings[]), optional chain verification, and timing windows. Successful requests embed cnf.x5t#S256 into the access token (and introspection output) so resource servers can enforce the certificate thumbprint.
  - security.senderConstraints.mtls.enforceForAudiences forces mTLS whenever the requested aud/resource (or the client's configured audiences) intersect the configured allow-list (default includes signer). Clients configured for different sender constraints are rejected early so operator policy remains consistent.
  - Certificate bindings now act as an allow-list: Authority verifies thumbprint, subject, issuer, serial number, and any declared SAN values against the presented certificate, with rotation grace windows applied to notBefore/notAfter. Operators can enforce subject regexes, SAN type allow-lists (dns, uri, ip), trusted certificate authorities, and rotation grace via security.senderConstraints.mtls.*.
Both modes persist additional metadata in authority_tokens: senderConstraint records the enforced policy, while senderKeyThumbprint stores the DPoP JWK thumbprint or mTLS certificate hash captured at issuance. Downstream services can rely on these fields (and the corresponding cnf claim) when auditing offline copies of the token store.
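For resource servers that need to compare a presented DPoP key with the cnf.jkt claim, the thumbprint can be computed as defined in RFC 7638: take the required members of the JWK in lexicographic order, serialise them without whitespace, hash with SHA-256, and base64url-encode without padding. The sketch below follows that recipe using only the standard library. The helper names are hypothetical, and it does not claim to mirror how Authority computes senderKeyThumbprint internally.

```python
import base64
import hashlib
import hmac
import json
from typing import Any, Mapping

# Required members per key type (RFC 7638 for EC/RSA, RFC 8037 for OKP).
_REQUIRED_MEMBERS = {
    "EC": ("crv", "kty", "x", "y"),
    "RSA": ("e", "kty", "n"),
    "OKP": ("crv", "kty", "x"),
}


def jwk_thumbprint(jwk: Mapping[str, str]) -> str:
    """Compute the SHA-256 JWK thumbprint, base64url-encoded without padding."""
    members = _REQUIRED_MEMBERS[jwk["kty"]]
    canonical = json.dumps(
        {name: jwk[name] for name in members},
        sort_keys=True,
        separators=(",", ":"),
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def token_bound_to_key(claims: Mapping[str, Any], jwk: Mapping[str, str]) -> bool:
    """Check that the access token's cnf.jkt matches the presented DPoP public key."""
    expected = (claims.get("cnf") or {}).get("jkt", "")
    return hmac.compare_digest(str(expected), jwk_thumbprint(jwk))
```

Optional members such as kid or use are ignored by construction, so the same key always yields the same thumbprint regardless of how the JWK was decorated.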
7.2 Policy Engine clients & scopes
Policy Engine v2 introduces dedicated scopes and a service identity that materialises effective findings. Configure Authority as follows when provisioning policy clients:
| Client | Scopes | Notes |
|---|---|---|
| policy-engine (service) | policy:run, findings:read, effective:write | Must include properties.serviceIdentity: policy-engine and a tenant. Authority rejects effective:write tokens without the marker or tenant. |
| policy-cli / automation | policy:write, policy:submit, policy:run, findings:read | Keep scopes minimal; only trusted automation should add policy:approve/policy:activate. |
| UI/editor sessions | policy:read, policy:write, policy:simulate (+ reviewer/approver scopes as appropriate) | Issue tenant-specific clients so audit and rate limits remain scoped. |
Sample YAML entry:
- clientId: "policy-engine"
  displayName: "Policy Engine Service"
  grantTypes: [ "client_credentials" ]
  audiences: [ "api://policy-engine" ]
  scopes: [ "policy:run", "findings:read", "effective:write" ]
  tenant: "tenant-default"
  properties:
    serviceIdentity: "policy-engine"
  senderConstraint: "dpop"
  auth:
    type: "client_secret"
    secretFile: "../secrets/policy-engine.secret"
Compliance checklist:
- policy-engine client includes properties.serviceIdentity: policy-engine and a tenant hint; logins missing either are rejected.
- Non-service clients omit effective:write and receive only the scopes required for their role (policy:write, policy:submit, policy:approve, policy:activate, etc.).
- Approval/activation workflows use identities distinct from authoring identities; tenants are provisioned per client to keep telemetry segregated.
- Operators document reviewer assignments and incident procedures alongside /docs/security/policy-governance.md and archive policy evidence bundles (stella policy bundle export) with each release.
8. Offline & Sovereign Operation
- No outbound dependencies: Authority only contacts MongoDB and local plugins. Discovery and JWKS are cached by clients with offline tolerances (AllowOfflineCacheFallback, OfflineCacheTolerance). Operators should mirror these responses for air-gapped use.
- Structured logging: Every revocation export, signing rotation, bootstrap action, and token issuance emits structured logs with traceId, client_id, subjectId, and network.remoteIp where applicable. Mirror logs to your SIEM to retain audit trails without central connectivity.
- Determinism: Sorting rules in token and revocation exports guarantee byte-for-byte identical artefacts given the same datastore state. Hashes and signatures remain stable across machines.
9. Operational Checklist
- Protect the bootstrap API key and disable bootstrap endpoints (bootstrap.enabled: false) once initial setup is complete.
- Schedule stella auth revoke export (or /internal/revocations/export) at the same cadence as Concelier exports so bundles remain in lockstep.
- Rotate signing keys before expiration; keep at least one retired key until all cached bundles/tokens signed with it have expired.
- Monitor /health and /ready plus rate-limiter metrics to detect plugin outages early.
- Ensure downstream services cache JWKS and revocation bundles within tolerances; stale caches risk accepting revoked tokens.
For plug-in specific requirements, refer to the Authority Plug-in Developer Guide. For the revocation bundle validation workflow, see Authority Revocation Bundle.