component_architecture_authority.md — Stella Ops Authority (2025Q4)
Scope. Implementation‑ready architecture for Stella Ops Authority: the on‑prem OIDC/OAuth2 service that issues short‑lived, sender‑constrained operational tokens (OpToks) to first‑party services and tools. Covers protocols (DPoP & mTLS binding), token shapes, endpoints, storage, rotation, HA, RBAC, audit, and testing. This component is the trust anchor for who is calling inside a Stella Ops installation. (Entitlement is proven separately by PoE from the cloud Licensing Service; Authority does not issue PoE.)
0) Mission & boundaries
Mission. Provide fast, local, verifiable authentication for Stella Ops microservices and tools by minting very short‑lived OAuth2/OIDC tokens that are sender‑constrained (DPoP or mTLS‑bound). Support RBAC scopes, multi‑tenant claims, and deterministic validation for APIs (Scanner, Signer, Attestor, Excititor, Concelier, UI, CLI, Zastava).
Boundaries.
- Authority does not validate entitlements/licensing. That’s enforced by Signer using PoE with the cloud Licensing Service.
- Authority tokens are operational only (2–5 min TTL) and must not be embedded in long‑lived artifacts or stored in SBOMs.
- Authority is stateless for validation (JWT) and offers optional introspection for services that prefer online checks.
1) Protocols & cryptography
- OIDC Discovery: /.well-known/openid-configuration
- OAuth2 grant types:
  - Client Credentials (service↔service, with mTLS or private_key_jwt)
  - Device Code (CLI login on headless agents; optional)
  - Authorization Code + PKCE (browser login for UI; optional)
- Sender constraint options (choose per caller or per audience):
  - DPoP (Demonstrating Proof of Possession): proof JWT on each HTTP request, bound to the access token via cnf.jkt.
  - OAuth 2.0 mTLS (certificate‑bound tokens): token bound to client certificate thumbprint via cnf.x5t#S256.
- Signing algorithms: EdDSA (Ed25519) preferred; fallback ES256 (P‑256). Rotation is supported via kid in JWKS.
- Token format: JWT access tokens (compact), optionally opaque reference tokens for services that insist on introspection.
- Clock skew tolerance: ±60 s; issue nbf, iat, exp accordingly.
2) Token model
2.1 Access token (OpTok) — short‑lived (120–300 s)
Registered claims
iss = https://authority.<domain>
sub = <client_id or user_id>
aud = <service audience: signer|scanner|attestor|concelier|excititor|ui|zastava>
exp = <unix ts> (<= 300 s from iat)
iat = <unix ts>
nbf = iat - 30
jti = <uuid>
scope = "scanner.scan scanner.export signer.sign ..."
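The registered claims above can be sketched as a small builder. This is an illustration only, not Authority's actual issuance code: the function name `build_optok_claims` and its parameters are hypothetical, and the 180 s default is taken from the sample configuration later in this document.

```python
import time
import uuid

MAX_TTL_SECONDS = 300      # upper bound for OpTok lifetime (exp <= iat + 300)
NBF_BACKDATE_SECONDS = 30  # nbf = iat - 30


def build_optok_claims(issuer, subject, audience, scopes, ttl_seconds=180, now=None):
    """Return the registered-claim portion of an OpTok payload (hypothetical helper)."""
    if not 0 < ttl_seconds <= MAX_TTL_SECONDS:
        raise ValueError("ttl_seconds must be between 1 and 300")
    iat = int(time.time()) if now is None else int(now)
    return {
        "iss": issuer,
        "sub": subject,
        "aud": audience,
        "exp": iat + ttl_seconds,
        "iat": iat,
        "nbf": iat - NBF_BACKDATE_SECONDS,
        "jti": str(uuid.uuid4()),
        "scope": " ".join(scopes),  # space-delimited scope string
    }


claims = build_optok_claims(
    "https://authority.internal", "scanner-web", "scanner",
    ["scanner.scan", "scanner.export"], ttl_seconds=180, now=1760668620,
)
print(claims["exp"], claims["nbf"])  # 1760668800 1760668590
```

Backdating nbf gives resource servers whose clocks run slightly behind a margin before the skew tolerance is even needed.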
Sender‑constraint (cnf)
- DPoP: "cnf": { "jkt": "<base64url(SHA-256(JWK))>" }
- mTLS: "cnf": { "x5t#S256": "<base64url(SHA-256(client_cert_der))>" }
Install/tenant context (custom claims)
tid = <tenant id> // multi-tenant
inst = <installation id> // unique installation
roles = [ "svc.scanner", "svc.signer", "ui.admin", ... ]
plan? = <plan name> // optional hint for UIs; not used for enforcement
Note: Do not copy PoE claims into OpTok; OpTok ≠ entitlement. Only Signer checks PoE.
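To make the two cnf values in 2.1 concrete, the sketch below computes both thumbprints with the standard library. It assumes the jkt value is the RFC 7638 JWK thumbprint (SHA-256 over the canonical JSON of the required key members), which is how DPoP defines it; the helper names and the sample key material are placeholders, not values from this document.

```python
import base64
import hashlib
import json

# Required members per key type for the canonical thumbprint input
# (RFC 7638 for EC/RSA, RFC 8037 for OKP keys such as Ed25519).
_REQUIRED_MEMBERS = {
    "OKP": ("crv", "kty", "x"),
    "EC": ("crv", "kty", "x", "y"),
    "RSA": ("e", "kty", "n"),
}


def b64url(data: bytes) -> str:
    """Base64url without padding, as used in JOSE."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwk_thumbprint(jwk: dict) -> str:
    """Value for cnf.jkt: hash only the required members, sorted, no whitespace."""
    members = {name: jwk[name] for name in _REQUIRED_MEMBERS[jwk["kty"]]}
    canonical = json.dumps(members, sort_keys=True, separators=(",", ":"))
    return b64url(hashlib.sha256(canonical.encode("utf-8")).digest())


def cert_thumbprint(cert_der: bytes) -> str:
    """Value for cnf.x5t#S256: SHA-256 over the DER-encoded client certificate."""
    return b64url(hashlib.sha256(cert_der).digest())


# Placeholder public key; "kid" is ignored because it is not a required member.
example_jwk = {"kty": "OKP", "crv": "Ed25519", "x": "placeholder-public-key", "kid": "dev"}
print({"cnf": {"jkt": jwk_thumbprint(example_jwk)}})
print({"cnf": {"x5t#S256": cert_thumbprint(b"placeholder DER bytes")}})
```

Because optional members such as kid or use are excluded, the same key always yields the same jkt regardless of how the client serializes its JWK.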
2.2 Refresh tokens (optional)
- Default disabled. If enabled (for UI interactive logins), pair with DPoP‑bound refresh tokens or mTLS client sessions; short TTL (≤ 8 h), rotating on use (replay‑safe).
2.3 ID tokens (optional)
- Issued for UI/browser OIDC flows (Authorization Code + PKCE); not used for service auth.
3) Endpoints & flows
3.1 OIDC discovery & keys
- GET /.well-known/openid-configuration → endpoints, algs, jwks_uri
- GET /jwks → JSON Web Key Set (rotating, at least 2 active keys during transition)
3.2 Token issuance
- POST /oauth/token
  - Client Credentials (service→service):
    - mTLS: mutual TLS + client_id → bound token (cnf.x5t#S256)
    - private_key_jwt: JWT‑based client auth + DPoP header (preferred for tools and CLI)
  - Device Code (CLI): POST /oauth/device/code + POST /oauth/token poll
  - Authorization Code + PKCE (UI): standard
DPoP handshake (example)
- Client prepares JWK (ephemeral keypair).
- Client sends DPoP proof header with fields htm=POST, htu=https://authority.../oauth/token, iat=<now>, jti=<uuid>, signed with the DPoP private key; header carries JWK.
- Authority validates proof; issues access token with cnf.jkt=<thumbprint(JWK)>.
- Client uses the same DPoP key to sign every subsequent API request to services (Signer, Scanner, …).
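The proof a client sends in the handshake can be sketched as follows. The header layout (typ "dpop+jwt", embedded public jwk) follows the DPoP specification; the function name `build_dpop_proof` and the `sign` callable are hypothetical. The Python standard library has no Ed25519 or ES256 signer, so the signature step is left pluggable and the demo uses a placeholder that is not a real signature.

```python
import base64
import json
import time
import uuid


def b64url(data: bytes) -> str:
    """Base64url without padding, as used in JOSE."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_dpop_proof(public_jwk, sign, htm, htu, alg="EdDSA", now=None):
    """Assemble a compact DPoP proof JWT.

    `sign` is a caller-supplied function (bytes -> signature bytes) backed by
    the DPoP private key; it is an assumption of this sketch.
    """
    header = {"typ": "dpop+jwt", "alg": alg, "jwk": public_jwk}
    payload = {
        "htm": htm,                                   # HTTP method of the request
        "htu": htu,                                   # target URI of the request
        "iat": int(time.time()) if now is None else int(now),
        "jti": str(uuid.uuid4()),                     # unique per proof
    }
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = sign(signing_input.encode("ascii"))
    return signing_input + "." + b64url(signature)


# Demo only: placeholder key and a dummy signer producing 64 zero bytes.
demo_jwk = {"kty": "OKP", "crv": "Ed25519", "x": "placeholder-public-key"}
proof = build_dpop_proof(
    demo_jwk, lambda data: b"\x00" * 64,
    "POST", "https://authority.internal/oauth/token",
)
print(proof.count("."))  # 2
```

The resulting string is what goes into the DPoP request header; a fresh proof (new jti and iat) is built for every request.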
mTLS flow
- Mutual TLS at the connection; Authority extracts client cert, validates chain; token carries cnf.x5t#S256.
3.3 Introspection & revocation (optional)
- POST /oauth/introspect → { active, sub, scope, aud, exp, cnf, ... }
- POST /oauth/revoke → revokes refresh tokens or opaque access tokens.
- Replay prevention: maintain DPoP jti cache (TTL ≤ 10 min) to reject duplicate proofs when services supply DPoP nonces (Signer requires nonce for high‑value operations).
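The jti replay cache can be illustrated with an in-memory stand-in. This is a sketch under assumptions: the class name `JtiReplayCache` is hypothetical, and a real deployment would use the shared Redis cache described under Storage & state so that all replicas see the same entries.

```python
import time


class JtiReplayCache:
    """Remembers each DPoP jti for ttl_seconds and rejects repeats."""

    def __init__(self, ttl_seconds=600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can control time
        self._expiry_by_jti = {}

    def check_and_store(self, jti):
        """Return True for a first-seen jti, False for a replay."""
        now = self._clock()
        # Drop entries whose TTL has elapsed so the map stays bounded.
        expired = [k for k, exp in self._expiry_by_jti.items() if exp <= now]
        for key in expired:
            del self._expiry_by_jti[key]
        if jti in self._expiry_by_jti:
            return False
        self._expiry_by_jti[jti] = now + self._ttl
        return True


cache = JtiReplayCache(ttl_seconds=600)
print(cache.check_and_store("4b1c9b3c"))  # True
print(cache.check_and_store("4b1c9b3c"))  # False
```

The TTL only needs to cover the window in which a proof's iat is still accepted as fresh; after that, a replayed proof fails the iat check instead.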
3.4 UserInfo (optional for UI)
- GET /userinfo (ID token context).
4) Audiences, scopes & RBAC
4.1 Audiences
- signer: only the Signer service should accept tokens with aud=signer.
- attestor, scanner, concelier, excititor, ui, zastava similarly.
Services must verify aud and sender constraint (DPoP/mTLS) per their policy.
4.2 Core scopes
| Scope | Service | Operation |
|---|---|---|
| signer.sign | Signer | Request DSSE signing |
| attestor.write | Attestor | Submit Rekor entries |
| scanner.scan | Scanner.WebService | Submit scan jobs |
| scanner.export | Scanner.WebService | Export SBOMs |
| scanner.read | Scanner.WebService | Read catalog/SBOMs |
| vex.read / vex.admin | Excititor | Query/operate |
| concelier.read / concelier.export | Concelier | Query/exports |
| ui.read / ui.admin | UI | View/admin |
| zastava.emit / zastava.enforce | Scanner/Zastava | Runtime events / admission |
Roles → scopes mapping is configured centrally (Authority policy) and pushed during token issuance.
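A minimal sketch of how a role→scope map might be applied at issuance follows. The `ROLE_SCOPES` table and the `grant_scopes` helper are hypothetical; the role and scope names come from this document, but which role carries which scope is an assumption made for illustration. Requests for scopes outside the client's roles are rejected with invalid_scope rather than silently narrowed, matching the "over‑privileged client denied" case in the testing matrix.

```python
# Hypothetical policy table: role -> scopes it may be granted.
ROLE_SCOPES = {
    "svc.scanner": {"scanner.scan", "scanner.export", "scanner.read"},
    "svc.signer": {"signer.sign"},
    "ui.admin": {"ui.read", "ui.admin"},
}


def grant_scopes(roles, requested):
    """Return the space-delimited scope claim, or raise if any scope is not allowed."""
    allowed = set()
    for role in roles:
        allowed |= ROLE_SCOPES.get(role, set())
    denied = set(requested) - allowed
    if denied:
        raise PermissionError("invalid_scope: " + " ".join(sorted(denied)))
    return " ".join(sorted(set(requested)))


print(grant_scopes(["svc.scanner"], ["scanner.scan", "scanner.read"]))
# scanner.read scanner.scan
```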
5) Storage & state
- Configuration DB (PostgreSQL/MySQL): clients, audiences, role→scope maps, tenant/installation registry, device code grants, persistent consents (if any).
- Cache (Redis):
  - DPoP jti replay cache (short TTL)
  - Nonce store (per resource server, if they demand nonce)
  - Device code pollers, rate limiting buckets
- JWKS: key material in HSM/KMS or encrypted at rest; JWKS served from memory.
6) Key management & rotation
- Maintain at least 2 signing keys active during rotation; tokens carry kid.
- Prefer Ed25519 for compact tokens; maintain ES256 fallback for FIPS contexts.
- Rotation cadence: 30–90 days; emergency rotation supported.
- Publish new JWKS before issuing tokens with the new kid to avoid cold‑start validation misses.
- Keep old keys available at least for max token TTL + 5 minutes.
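The retention rule in the last bullet is simple arithmetic; the sketch below works it through for the 300 s maximum OpTok lifetime. The function name and the cutover timestamp are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MAX_TOKEN_TTL = timedelta(seconds=300)  # longest OpTok lifetime
SAFETY_MARGIN = timedelta(minutes=5)    # extra margin from the retention rule


def earliest_key_retirement(last_issued_with_old_kid: datetime) -> datetime:
    """Earliest moment the old public key may be removed from the JWKS."""
    return last_issued_with_old_kid + MAX_TOKEN_TTL + SAFETY_MARGIN


# Hypothetical cutover: the last token signed with the old kid was issued at 02:37 UTC.
cutover = datetime(2025, 10, 17, 2, 37, tzinfo=timezone.utc)
print(earliest_key_retirement(cutover).isoformat())  # 2025-10-17T02:47:00+00:00
```

With short-lived tokens the overlap window is only ten minutes, which is what makes frequent and emergency rotation practical.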
7) HA & performance
- Stateless issuance (except device codes/refresh) → scale horizontally behind a load‑balancer.
- DB only for client metadata and optional flows; token checks are JWT‑local; introspection endpoints hit cache/DB minimally.
- Targets:
  - Token issuance P95 ≤ 20 ms under warm cache.
  - DPoP proof validation ≤ 1 ms extra per request at resource servers (Signer/Scanner).
  - 99.9% uptime; HPA on CPU/latency.
8) Security posture
- Strict TLS (1.3 preferred); HSTS; modern cipher suites.
- mTLS enabled where required (Signer/Attestor paths).
- Replay protection: DPoP jti cache, nonce support for Signer (add DPoP-Nonce header on 401; clients re‑sign).
- Rate limits per client & per IP; exponential backoff on failures.
- Secrets: clients use private_key_jwt or mTLS; never basic secrets over the wire.
- CSP/CSRF hardening on UI flows; SameSite=Lax cookies; PKCE enforced.
- Logs redact Authorization and DPoP proofs; store sub, aud, scopes, inst, tid, cnf thumbprints, not full keys.
9) Multi‑tenancy & installations
- Tenant (tid) and Installation (inst) registries define which audiences/scopes a client can request.
- Cross‑tenant isolation enforced at issuance (disallow rogue aud), and resource servers must check that tid matches their configured tenant.
10) Admin & operations APIs
All under /admin (mTLS + authority.admin scope).
POST /admin/clients # create/update client (confidential/public)
POST /admin/audiences # register audience resource URIs
POST /admin/roles # define role→scope mappings
POST /admin/tenants # create tenant/install entries
POST /admin/keys/rotate # rotate signing key (zero-downtime)
GET /admin/metrics # Prometheus exposition (token issue rates, errors)
GET /admin/healthz|readyz # health/readiness
11) Integration hard lines (what resource servers must enforce)
Every Stella Ops service that consumes Authority tokens must:
- Verify JWT signature (kid in JWKS), iss, aud, exp, nbf.
- Enforce sender‑constraint:
  - DPoP: validate DPoP proof (htu, htm, iat, jti) and match cnf.jkt; cache jti for replay defense; honor nonce challenges.
  - mTLS: match presented client cert thumbprint to token cnf.x5t#S256.
- Check scopes; optionally map to internal roles.
- Check tenant (tid) and installation (inst) as appropriate.
- For Signer only: require both OpTok and PoE in the request (enforced by Signer, not Authority).
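The claim checks in this list can be sketched as a single function that runs after the JWT signature has been verified against the JWKS. Signature verification, DPoP proof validation and PoE handling are deliberately out of scope here; the function name, parameter names and reason strings are hypothetical.

```python
import hmac


def check_optok_claims(claims, *, issuer, audience, required_scope, tenant,
                       presented_thumbprint, binding="jkt", now, skew=60):
    """Return None if the claims are acceptable, otherwise a short reason.

    `binding` is "jkt" for DPoP or "x5t#S256" for mTLS; `presented_thumbprint`
    is computed by the caller from the DPoP proof's JWK or the client cert.
    """
    if claims.get("iss") != issuer:
        return "wrong iss"
    aud = claims.get("aud")
    if audience not in (aud if isinstance(aud, list) else [aud]):
        return "wrong aud"
    if now > claims.get("exp", 0) + skew:
        return "expired"
    if now < claims.get("nbf", 0) - skew:
        return "not yet valid"
    bound = claims.get("cnf", {}).get(binding, "")
    # A token without the expected cnf member is treated as a plain bearer token.
    if not bound or not hmac.compare_digest(bound, presented_thumbprint):
        return "sender constraint mismatch"
    if required_scope not in claims.get("scope", "").split():
        return "missing scope"
    if claims.get("tid") != tenant:
        return "wrong tenant"
    return None


sample = {
    "iss": "https://authority.internal", "aud": "signer",
    "exp": 1760668800, "nbf": 1760668590, "scope": "signer.sign",
    "tid": "tenant-01", "cnf": {"jkt": "placeholder-thumbprint"},
}
print(check_optok_claims(
    sample, issuer="https://authority.internal", audience="signer",
    required_scope="signer.sign", tenant="tenant-01",
    presented_thumbprint="placeholder-thumbprint", now=1760668700,
))  # None
```

Returning a reason string rather than raising keeps the sketch easy to map onto a WWW-Authenticate error_description; a real service may prefer exceptions.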
12) Error surfaces & UX
- Token endpoint errors follow OAuth2 (invalid_client, invalid_grant, invalid_scope, unauthorized_client).
- Resource servers use RFC 6750 style (WWW-Authenticate: DPoP error="invalid_token", error_description="…", dpop_nonce="…").
- For DPoP nonce challenges, clients retry with the server‑supplied nonce once.
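The retry-once rule for nonce challenges can be sketched on the client side as below. The `send` callable is a hypothetical transport hook that builds a fresh DPoP proof (embedding the nonce when one is given) and returns a status, a header dictionary and a body; the sketch assumes the server supplies the nonce in a DPoP-Nonce response header, as described under Security posture.

```python
def call_with_nonce_retry(send):
    """Call send(None); on a 401 carrying a nonce, retry exactly once with it."""
    status, headers, body = send(None)
    nonce = headers.get("DPoP-Nonce")
    if status == 401 and nonce is not None:
        # One retry only: a second challenge is surfaced to the caller as-is.
        status, headers, body = send(nonce)
    return status, body


def fake_signer(nonce):
    """Stand-in for a resource server that demands the nonce 'n-123'."""
    if nonce != "n-123":
        return 401, {"DPoP-Nonce": "n-123"}, ""
    return 200, {}, "signed"


print(call_with_nonce_retry(fake_signer))  # (200, 'signed')
```

Capping the retry at one attempt keeps a misbehaving server from driving the client into a challenge loop.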
13) Observability & audit
- Metrics:
  - authority.tokens_issued_total{grant,aud}
  - authority.dpop_validations_total{result}
  - authority.mtls_bindings_total{result}
  - authority.jwks_rotations_total
  - authority.errors_total{type}
- Audit log (immutable sink): token issuance (sub, aud, scopes, tid, inst, cnf thumbprint, jti), revocations, admin changes.
- Tracing: token flows, DB reads, JWKS cache.
14) Configuration (YAML)
authority:
  issuer: "https://authority.internal"
  keys:
    algs: [ "EdDSA", "ES256" ]
    rotationDays: 60
    storage: kms://cluster-kms/authority-signing
  tokens:
    accessTtlSeconds: 180
    enableRefreshTokens: false
    clockSkewSeconds: 60
  dpop:
    enable: true
    nonce:
      enable: true
      ttlSeconds: 600
  mtls:
    enable: true
    caBundleFile: /etc/ssl/mtls/clients-ca.pem
  clients:
    - clientId: scanner-web
      grantTypes: [ "client_credentials" ]
      audiences: [ "scanner" ]
      auth: { type: "private_key_jwt", jwkFile: "/secrets/scanner-web.jwk" }
      senderConstraint: "dpop"
      scopes: [ "scanner.scan", "scanner.export", "scanner.read" ]
    - clientId: signer
      grantTypes: [ "client_credentials" ]
      audiences: [ "signer" ]
      auth: { type: "mtls" }
      senderConstraint: "mtls"
      scopes: [ "signer.sign" ]
    - clientId: notify-web-dev
      grantTypes: [ "client_credentials" ]
      audiences: [ "notify.dev" ]
      auth: { type: "client_secret", secretFile: "/secrets/notify-web-dev.secret" }
      senderConstraint: "dpop"
      scopes: [ "notify.read", "notify.admin" ]
    - clientId: notify-web
      grantTypes: [ "client_credentials" ]
      audiences: [ "notify" ]
      auth: { type: "client_secret", secretFile: "/secrets/notify-web.secret" }
      senderConstraint: "dpop"
      scopes: [ "notify.read", "notify.admin" ]
15) Testing matrix
- JWT validation: wrong aud, expired exp, skewed nbf, stale kid.
- DPoP: invalid htu/htm, replayed jti, stale iat, wrong jkt, nonce dance.
- mTLS: wrong client cert, wrong CA, thumbprint mismatch.
- RBAC: scope enforcement per audience; over‑privileged client denied.
- Rotation: JWKS rotation while load‑testing; zero‑downtime verification.
- HA: kill one Authority instance; verify issuance continues; JWKS served by peers.
- Performance: 1k token issuance/sec on 2 cores with Redis enabled for jti caching.
16) Threat model & mitigations (summary)
| Threat | Vector | Mitigation |
|---|---|---|
| Token theft | Copy of JWT | Short TTL, sender‑constraint (DPoP/mTLS); replay blocked by jti cache and nonces |
| Replay across hosts | Reuse DPoP proof | Enforce htu/htm, iat freshness, jti uniqueness; services may require nonce |
| Impersonation | Fake client | mTLS or private_key_jwt with pinned JWK; client registration & rotation |
| Key compromise | Signing key leak | HSM/KMS storage, key rotation, audit; emergency key revoke path; narrow token TTL |
| Cross‑tenant abuse | Scope elevation | Enforce aud, tid, inst at issuance and resource servers |
| Downgrade to bearer | Strip DPoP | Resource servers require DPoP/mTLS based on aud; reject bearer without cnf |
17) Deployment & HA
- Stateless microservice, containerized; run ≥ 2 replicas behind LB.
- DB: HA Postgres (or MySQL) for clients/roles; Redis for device codes, DPoP nonces/jtis.
- Secrets: mount client JWKs via K8s Secrets/HashiCorp Vault; signing keys via KMS.
- Backups: DB daily; Redis not critical (ephemeral).
- Disaster recovery: export/import of client registry; JWKS rehydrate from KMS.
- Compliance: TLS audit; penetration testing for OIDC flows.
18) Implementation notes
- Reference stack: .NET 10 + OpenIddict 6 (or IdentityServer if licensed) with custom DPoP validator and mTLS binding middleware.
- Keep the DPoP/JTI cache pluggable; allow Redis/Memcached.
- Provide client SDKs for C# and Go: DPoP key mgmt, proof generation, nonce handling, token refresh helper.
19) Quick reference — wire examples
Access token (payload excerpt)
{
  "iss": "https://authority.internal",
  "sub": "scanner-web",
  "aud": "signer",
  "exp": 1760668800,
  "iat": 1760668620,
  "nbf": 1760668590,
  "jti": "9d9c3f01-6e1a-49f1-8f77-9b7e6f7e3c50",
  "scope": "signer.sign",
  "tid": "tenant-01",
  "inst": "install-7A2B",
  "cnf": { "jkt": "KcVb2V...base64url..." }
}
DPoP proof header fields (for POST /sign/dsse)
{
"htu": "https://signer.internal/sign/dsse",
"htm": "POST",
"iat": 1760668620,
"jti": "4b1c9b3c-8a95-4c58-8a92-9c6cfb4a6a0b"
}
Signer validates that hash(JWK) in the proof matches cnf.jkt in the token.
20) Rollout plan
- MVP: Client Credentials (private_key_jwt + DPoP), JWKS, short OpToks, per‑audience scopes.
- Add: mTLS‑bound tokens for Signer/Attestor; device code for CLI; optional introspection.
- Hardening: DPoP nonce support; full audit pipeline; HA tuning.
- UX: Tenant/installation admin UI; role→scope editors; client bootstrap wizards.