component_architecture_authority.md — Stella Ops Authority (2025Q4)

Consolidates identity and tenancy requirements documented across the AOC, Policy, and Platform guides, along with the dedicated Authority implementation plan.

Scope. Implementation‑ready architecture for Stella Ops Authority: the on‑prem OIDC/OAuth2 service that issues short‑lived, sender‑constrained operational tokens (OpToks) to first‑party services and tools. Covers protocols (DPoP & mTLS binding), token shapes, endpoints, storage, rotation, HA, RBAC, audit, and testing. This component is the trust anchor for who is calling inside a Stella Ops installation. (Entitlement is proven separately by PoE from the cloud Licensing Service; Authority does not issue PoE.)

0) Mission & boundaries

Mission. Provide fast, local, verifiable authentication for Stella Ops microservices and tools by minting very short‑lived OAuth2/OIDC tokens that are sender‑constrained (DPoP or mTLS‑bound). Support RBAC scopes, multi‑tenant claims, and deterministic validation for APIs (Scanner, Signer, Attestor, Excititor, Concelier, UI, CLI, Zastava).

Boundaries.

Authority does not validate entitlements/licensing. That’s enforced by Signer using PoE with the cloud Licensing Service.
Authority tokens are operational only (2–5 min TTL) and must not be embedded in long‑lived artifacts or stored in SBOMs.
Authority validation is stateless (JWTs are verified locally by each service); introspection is optional, for services that prefer online checks.

1) Protocols & cryptography

OIDC Discovery: /.well-known/openid-configuration
OAuth2 grant types:
- Client Credentials (service↔service, with mTLS or private_key_jwt)
- Device Code (CLI login on headless agents; optional)
- Authorization Code + PKCE (browser login for UI; optional)
Sender constraint options (choose per caller or per audience):
- DPoP (Demonstration of Proof‑of‑Possession): proof JWT on each HTTP request, bound to the access token via cnf.jkt.
- OAuth 2.0 mTLS (certificate‑bound tokens): token bound to client certificate thumbprint via cnf.x5t#S256.
Signing algorithms: EdDSA (Ed25519) preferred; fallback ES256 (P‑256). Rotation is supported via kid in JWKS.
Token format: JWT access tokens (compact), optionally opaque reference tokens for services that insist on introspection.
Clock skew tolerance: ±60 s; issue nbf, iat, exp accordingly.

2) Token model

2.1 Access token (OpTok) — short‑lived (120–300 s)

Registered claims

iss   = https://authority.<domain>
sub   = <client_id or user_id>
aud   = <service audience: signer|scanner|attestor|concelier|excititor|ui|zastava>
exp   = <unix ts>  (<= 300 s from iat)
iat   = <unix ts>
nbf   = iat - 30
jti   = <uuid>
scope = "scanner.scan scanner.export signer.sign ..."
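
As a minimal sketch of how the time-based claims above relate to each other, the following assembles the registered claims for one OpTok. The function name `build_optok_claims`, its parameters, and the default TTL of 180 s are illustrative assumptions, not part of the Authority codebase; signing is out of scope here.

```python
import time
import uuid

MAX_TTL_SECONDS = 300      # exp must be <= 300 s from iat (see claims above)
NBF_BACKDATE_SECONDS = 30  # nbf = iat - 30 (see claims above)


def build_optok_claims(issuer, subject, audience, scopes, ttl_seconds=180, now=None):
    """Return the registered claims for one OpTok as a plain dict (unsigned)."""
    if not 0 < ttl_seconds <= MAX_TTL_SECONDS:
        raise ValueError("ttl_seconds must be in (0, 300]")
    iat = int(time.time()) if now is None else int(now)
    return {
        "iss": issuer,
        "sub": subject,
        "aud": audience,
        "iat": iat,
        "nbf": iat - NBF_BACKDATE_SECONDS,  # tolerate slightly slow validator clocks
        "exp": iat + ttl_seconds,
        "jti": str(uuid.uuid4()),           # unique per token
        "scope": " ".join(scopes),          # space-delimited, as in OAuth2
    }
```

Passing `now` explicitly keeps the function deterministic in tests; production code would leave it unset.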

Sender‑constraint (cnf)

DPoP:

"cnf": { "jkt": "<base64url(SHA-256(JWK))>" }

mTLS:

"cnf": { "x5t#S256": "<base64url(SHA-256(client_cert_der))>" }

Install/tenant context (custom claims)

tid          = <tenant id>               // multi-tenant
inst         = <installation id>        // unique installation
roles        = [ "svc.scanner", "svc.signer", "ui.admin", ... ]
plan?        = <plan name>              // optional hint for UIs; not used for enforcement

Note: Do not copy PoE claims into OpTok; OpTok ≠ entitlement. Only Signer checks PoE.
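
The two `cnf` thumbprints in this section can be computed with the standard library alone. This is a sketch under the assumption that `jkt` follows the RFC 7638 JWK thumbprint rules (required members only, sorted keys, no whitespace) and that `x5t#S256` hashes the DER-encoded certificate; the helper names are hypothetical.

```python
import base64
import hashlib
import json

# Members that participate in the thumbprint, per key type (RFC 7638; OKP per RFC 8037).
_REQUIRED_MEMBERS = {
    "EC": ("crv", "kty", "x", "y"),
    "OKP": ("crv", "kty", "x"),
    "RSA": ("e", "kty", "n"),
}


def b64url(data: bytes) -> str:
    """Base64url without padding, as used in JOSE."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwk_thumbprint(jwk: dict) -> str:
    """Value for cnf.jkt: SHA-256 over the canonical JSON of the required members."""
    members = _REQUIRED_MEMBERS[jwk["kty"]]
    canonical = json.dumps(
        {name: jwk[name] for name in members},
        sort_keys=True,
        separators=(",", ":"),
    )
    return b64url(hashlib.sha256(canonical.encode("utf-8")).digest())


def cert_thumbprint(cert_der: bytes) -> str:
    """Value for cnf.x5t#S256: SHA-256 over the DER-encoded client certificate."""
    return b64url(hashlib.sha256(cert_der).digest())
```

Optional JWK members such as `kid` or `use` are ignored by construction, so the same key always yields the same `jkt`.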

2.2 Refresh tokens (optional)

Default disabled. If enabled (for UI interactive logins), pair with DPoP‑bound refresh tokens or mTLS client sessions; short TTL (≤ 8 h), rotating on use (replay‑safe).

2.3 ID tokens (optional)

Issued for UI/browser OIDC flows (Authorization Code + PKCE); not used for service auth.

3) Endpoints & flows

3.1 OIDC discovery & keys

GET /.well-known/openid-configuration → endpoints, algs, jwks_uri
GET /jwks → JSON Web Key Set (rotating, at least 2 active keys during transition)

3.2 Token issuance

POST /oauth/token
- Client Credentials (service→service):
  - mTLS: mutual TLS + client_id → bound token (cnf.x5t#S256)
    - security.senderConstraints.mtls.enforceForAudiences forces the mTLS path when requested aud/resource values intersect high-value audiences (defaults include signer). Authority rejects clients attempting to use DPoP/basic secrets for these audiences.
    - Stored certificateBindings are authoritative: thumbprint, subject, issuer, serial number, and SAN values are matched against the presented certificate, with rotation grace applied to activation windows. Failures surface deterministic error codes (e.g. certificate_binding_subject_mismatch).
  - private_key_jwt: JWT‑based client auth + DPoP header (preferred for tools and CLI)
- Device Code (CLI): POST /oauth/device/code + POST /oauth/token poll
- Authorization Code + PKCE (UI): standard
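
The audience-based mTLS enforcement described above might look like the following at issuance time. This is a sketch only: the function name, the string values for the sender constraint, and the choice of `invalid_client` as the returned error code are assumptions.

```python
# Mirrors the documented default for
# security.senderConstraints.mtls.enforceForAudiences.
MTLS_ENFORCED_AUDIENCES = frozenset({"signer"})


def check_sender_constraint(requested_audiences, client_constraint,
                            enforced=MTLS_ENFORCED_AUDIENCES):
    """Return None if the request may proceed, else an OAuth2-style error code.

    client_constraint is the mechanism the client actually used for this
    request, e.g. "mtls" or "dpop".
    """
    high_value = set(requested_audiences) & set(enforced)
    if high_value and client_constraint != "mtls":
        # DPoP or shared secrets are not acceptable for high-value audiences.
        return "invalid_client"
    return None
```

For example, `check_sender_constraint(["signer"], "dpop")` is rejected, while the same client requesting only `scanner` is allowed through.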

DPoP handshake (example)

Client prepares JWK (ephemeral keypair).
Client sends DPoP proof header with fields:
```
htm=POST
htu=https://authority.../oauth/token
iat=<now>
jti=<uuid>
```
signed with the DPoP private key; header carries JWK.
Authority validates proof; issues access token with cnf.jkt=<thumbprint(JWK)>.
Client uses the same DPoP key to sign every subsequent API request to services (Signer, Scanner, …).

mTLS flow

Mutual TLS at the connection; Authority extracts client cert, validates chain; token carries cnf.x5t#S256.

3.3 Introspection & revocation (optional)

POST /oauth/introspect → { active, sub, scope, aud, exp, cnf, ... }
POST /oauth/revoke → revokes refresh tokens or opaque access tokens.
Replay prevention: maintain a DPoP jti cache (TTL ≤ 10 min) to reject duplicate proofs; services may additionally issue DPoP nonces (Signer requires a nonce for high‑value operations).
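
A minimal in-memory sketch of the jti replay cache follows. The class name and the injectable clock are assumptions for illustration; a deployment would back this with the shared cache described in section 5 rather than process memory.

```python
import time


class JtiReplayCache:
    """Remembers each DPoP jti for a fixed TTL and flags repeats."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._seen = {}  # jti -> expiry timestamp

    def register(self, jti: str) -> bool:
        """Record jti; return True if it is fresh, False if it is a replay."""
        now = self._clock()
        # Drop entries whose TTL has elapsed so the map stays bounded.
        self._seen = {k: exp for k, exp in self._seen.items() if exp > now}
        if jti in self._seen:
            return False
        self._seen[jti] = now + self._ttl
        return True
```

With Redis, the same check-and-set can be done atomically with a single `SET <key> <value> NX EX <ttl>` command, which avoids a race between concurrent validators.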

3.4 UserInfo (optional for UI)

GET /userinfo (ID token context).

4) Audiences, scopes & RBAC

4.1 Audiences

signer — only the Signer service should accept tokens with aud=signer.
attestor, scanner, concelier, excititor, ui, zastava similarly.

Services must verify aud and sender constraint (DPoP/mTLS) per their policy.

4.2 Core scopes

| Scope | Service | Operation |
|---|---|---|
| `signer.sign` | Signer | Request DSSE signing |
| `attestor.write` | Attestor | Submit Rekor entries |
| `scanner.scan` | Scanner.WebService | Submit scan jobs |
| `scanner.export` | Scanner.WebService | Export SBOMs |
| `scanner.read` | Scanner.WebService | Read catalog/SBOMs |
| `vex.read` / `vex.admin` | Excititor | Query/operate |
| `concelier.read` / `concelier.export` | Concelier | Query/exports |
| `ui.read` / `ui.admin` | UI | View/admin |
| `zastava.emit` / `zastava.enforce` | Scanner/Zastava | Runtime events / admission |

Roles → scopes mapping is configured centrally (Authority policy) and pushed during token issuance.
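
One way to picture the roles → scopes mapping at issuance is shown below. The `ROLE_SCOPES` table is a hypothetical example assembled from role and scope names that appear in this document, and rejecting the whole request on any disallowed scope is an assumed policy.

```python
# Hypothetical mapping; the real one lives in Authority policy.
ROLE_SCOPES = {
    "svc.scanner": {"scanner.scan", "scanner.export", "scanner.read"},
    "svc.signer": {"signer.sign"},
    "ui.admin": {"ui.read", "ui.admin"},
}


def granted_scopes(roles, requested, role_scopes=ROLE_SCOPES):
    """Return the space-delimited scope claim, or raise if any scope is not allowed."""
    allowed = set()
    for role in roles:
        allowed |= role_scopes.get(role, set())
    denied = set(requested) - allowed
    if denied:
        raise PermissionError("invalid_scope: " + " ".join(sorted(denied)))
    # Sorted output keeps the claim deterministic across requests.
    return " ".join(sorted(set(requested)))
```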

5) Storage & state

Configuration DB (PostgreSQL/MySQL): clients, audiences, role→scope maps, tenant/installation registry, device code grants, persistent consents (if any).
Cache (Redis):
- DPoP jti replay cache (short TTL)
- Nonce store (per resource server, if they demand nonce)
- Device code pollers, rate limiting buckets
JWKS: key material in HSM/KMS or encrypted at rest; JWKS served from memory.

6) Key management & rotation

Maintain at least 2 signing keys active during rotation; tokens carry kid.
Prefer Ed25519 for compact tokens; maintain ES256 fallback for FIPS contexts.
Rotation cadence: 30–90 days; emergency rotation supported.
Publish new JWKS before issuing tokens with the new kid to avoid cold‑start validation misses.
Keep old keys published for at least the maximum token TTL + 5 minutes after they stop signing.
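
The retention rule can be expressed as a simple filter over key metadata. This sketch assumes each key record carries a `kid` and a `retired_at` timestamp (`None` while the key still signs); both field names are hypothetical.

```python
MAX_TOKEN_TTL_SECONDS = 300     # longest OpTok lifetime
RETENTION_MARGIN_SECONDS = 300  # the extra 5 minutes


def published_kids(keys, now):
    """Return the kids that should still appear in the JWKS at time `now`."""
    keep_for = MAX_TOKEN_TTL_SECONDS + RETENTION_MARGIN_SECONDS
    return [
        k["kid"]
        for k in keys
        # Active keys stay; retired keys stay until every token they signed has expired.
        if k["retired_at"] is None or now - k["retired_at"] < keep_for
    ]
```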

7) HA & performance

Stateless issuance (except device codes/refresh) → scale horizontally behind a load‑balancer.
DB only for client metadata and optional flows; token checks are JWT‑local; introspection endpoints hit cache/DB minimally.
Targets:
- Token issuance P95 ≤ 20 ms under warm cache.
- DPoP proof validation ≤ 1 ms extra per request at resource servers (Signer/Scanner).
- 99.9% uptime; HPA on CPU/latency.

8) Security posture

Strict TLS (1.3 preferred); HSTS; modern cipher suites.
mTLS enabled where required (Signer/Attestor paths).
Replay protection: DPoP jti cache, nonce support for Signer (add DPoP-Nonce header on 401; clients re‑sign).
Rate limits per client & per IP; exponential backoff on failures.
Secrets: clients use private_key_jwt or mTLS; never basic secrets over the wire.
CSP/CSRF hardening on UI flows; SameSite=Lax cookies; PKCE enforced.
Logs redact Authorization and DPoP proofs; store sub, aud, scopes, inst, tid, cnf thumbprints, not full keys.

9) Multi‑tenancy & installations

Tenant (tid) and Installation (inst) registries define which audiences/scopes a client can request.
Cross‑tenant isolation enforced at issuance (disallow rogue aud), and resource servers must check that tid matches their configured tenant.

10) Admin & operations APIs

All under /admin (mTLS + authority.admin scope).

POST /admin/clients                 # create/update client (confidential/public)
POST /admin/audiences               # register audience resource URIs
POST /admin/roles                   # define role→scope mappings
POST /admin/tenants                 # create tenant/install entries
POST /admin/keys/rotate             # rotate signing key (zero-downtime)
GET  /admin/metrics                 # Prometheus exposition (token issue rates, errors)
GET  /admin/healthz|readyz          # health/readiness

Declared client audiences flow through to the issued JWT aud claim and the token request's resource indicators. Authority relies on this metadata to enforce DPoP nonce challenges for signer, attestor, and other high-value services without requiring clients to repeat the audience parameter on every request.

11) Integration hard lines (what resource servers must enforce)

Every Stella Ops service that consumes Authority tokens must:

Verify JWT signature (kid in JWKS), iss, aud, exp, nbf.
Enforce sender‑constraint:
- DPoP: validate DPoP proof (htu, htm, iat, jti) and match cnf.jkt; cache jti for replay defense; honor nonce challenges.
- mTLS: match presented client cert thumbprint to token cnf.x5t#S256.
Check scopes; optionally map to internal roles.
Check tenant (tid) and installation (inst) as appropriate.
For Signer only: require both OpTok and PoE in the request (enforced by Signer, not Authority).
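
The checklist above, minus signature verification, can be sketched as a single function over already-decoded claims. Everything here is illustrative: the function name, the reason strings, and the 60 s skew (taken from section 1) are assumptions, and `aud` is treated as a single string as in the token examples.

```python
import time

CLOCK_SKEW_SECONDS = 60  # tolerance from section 1


def check_optok(claims, *, issuer, audience, tenant, required_scope,
                presented_thumbprint, now=None):
    """Return None when the claims pass, otherwise a short reason string.

    The JWT signature is assumed to be verified already. presented_thumbprint
    is the DPoP JWK thumbprint or client-cert thumbprint seen on this request.
    """
    now = time.time() if now is None else now
    if claims.get("iss") != issuer:
        return "bad_iss"
    if claims.get("aud") != audience:
        return "bad_aud"
    if now > claims.get("exp", 0) + CLOCK_SKEW_SECONDS:
        return "expired"
    if now < claims.get("nbf", 0) - CLOCK_SKEW_SECONDS:
        return "not_yet_valid"
    cnf = claims.get("cnf") or {}
    bound = cnf.get("jkt") or cnf.get("x5t#S256")
    if bound is None:
        return "missing_cnf"  # plain bearer tokens are rejected
    if bound != presented_thumbprint:
        return "cnf_mismatch"
    if required_scope not in claims.get("scope", "").split():
        return "insufficient_scope"
    if claims.get("tid") != tenant:
        return "wrong_tenant"
    return None
```

Returning a reason string rather than raising keeps the sketch easy to map onto whatever error surface the service uses.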

12) Error surfaces & UX

Token endpoint errors follow OAuth2 (invalid_client, invalid_grant, invalid_scope, unauthorized_client).
Resource servers use RFC 6750 style (WWW-Authenticate: DPoP error="invalid_token", error_description="…", dpop_nonce="…").
For DPoP nonce challenges, clients retry with the server‑supplied nonce once.
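
On the client side, the retry-once rule might be implemented as below. The `send` callable is a hypothetical transport hook that builds a fresh DPoP proof (embedding the nonce when given) and returns `(status, headers, body)`; reading the nonce from a `DPoP-Nonce` response header follows section 8.

```python
def call_with_nonce_retry(send):
    """Invoke send(None); on a 401 carrying a nonce, retry exactly once with it."""
    status, headers, body = send(None)
    nonce = headers.get("DPoP-Nonce")
    if status == 401 and nonce:
        # A second challenge is not retried, which avoids a retry loop.
        status, headers, body = send(nonce)
    return status, body
```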

13) Observability & audit

Metrics:
- authority.tokens_issued_total{grant,aud}
- authority.dpop_validations_total{result}
- authority.mtls_bindings_total{result}
- authority.jwks_rotations_total
- authority.errors_total{type}
Audit log (immutable sink): token issuance (sub, aud, scopes, tid, inst, cnf thumbprint, jti), revocations, admin changes.
Tracing: token flows, DB reads, JWKS cache.

14) Configuration (YAML)

authority:
  issuer: "https://authority.internal"
  signing:
    enabled: true
    activeKeyId: "authority-signing-2025"
    keyPath: "../certificates/authority-signing-2025.pem"
    algorithm: "ES256"
    keySource: "file"
  security:
    rateLimiting:
      token:
        enabled: true
        permitLimit: 30
        window: "00:01:00"
        queueLimit: 0
      authorize:
        enabled: true
        permitLimit: 60
        window: "00:01:00"
        queueLimit: 10
      internal:
        enabled: false
        permitLimit: 5
        window: "00:01:00"
        queueLimit: 0
    senderConstraints:
      dpop:
        enabled: true
        allowedAlgorithms: [ "ES256", "ES384" ]
        proofLifetime: "00:02:00"
        allowedClockSkew: "00:00:30"
        replayWindow: "00:05:00"
        nonce:
          enabled: true
          ttl: "00:10:00"
          maxIssuancePerMinute: 120
          store: "redis"
          redisConnectionString: "redis://authority-redis:6379?ssl=false"
          requiredAudiences:
            - "signer"
            - "attestor"
      mtls:
        enabled: true
        requireChainValidation: true
        rotationGrace: "00:15:00"
        enforceForAudiences:
          - "signer"
        allowedSanTypes:
          - "dns"
          - "uri"
        allowedCertificateAuthorities:
          - "/etc/ssl/mtls/clients-ca.pem"
  clients:
    - clientId: scanner-web
      grantTypes: [ "client_credentials" ]
      audiences: [ "scanner" ]
      auth: { type: "private_key_jwt", jwkFile: "/secrets/scanner-web.jwk" }
      senderConstraint: "dpop"
      scopes: [ "scanner.scan", "scanner.export", "scanner.read" ]
    - clientId: signer
      grantTypes: [ "client_credentials" ]
      audiences: [ "signer" ]
      auth: { type: "mtls" }
      senderConstraint: "mtls"
      scopes: [ "signer.sign" ]
    - clientId: notify-web-dev
      grantTypes: [ "client_credentials" ]
      audiences: [ "notify.dev" ]
      auth: { type: "client_secret", secretFile: "/secrets/notify-web-dev.secret" }
      senderConstraint: "dpop"
      scopes: [ "notify.read", "notify.admin" ]
    - clientId: notify-web
      grantTypes: [ "client_credentials" ]
      audiences: [ "notify" ]
      auth: { type: "client_secret", secretFile: "/secrets/notify-web.secret" }
      senderConstraint: "dpop"
      scopes: [ "notify.read", "notify.admin" ]

15) Testing matrix

JWT validation: wrong aud, expired exp, skewed nbf, stale kid.
DPoP: invalid htu/htm, replayed jti, stale iat, wrong jkt, nonce dance.
mTLS: wrong client cert, wrong CA, thumbprint mismatch.
RBAC: scope enforcement per audience; over‑privileged client denied.
Rotation: JWKS rotation while load‑testing; zero‑downtime verification.
HA: kill one Authority instance; verify issuance continues; JWKS served by peers.
Performance: 1k token issuances/sec on 2 cores with Redis enabled for jti caching.

16) Threat model & mitigations (summary)

| Threat | Vector | Mitigation |
|---|---|---|
| Token theft | Copy of JWT | Short TTL, sender‑constraint (DPoP/mTLS); replay blocked by `jti` cache and nonces |
| Replay across hosts | Reuse DPoP proof | Enforce `htu`/`htm`, `iat` freshness, `jti` uniqueness; services may require nonce |
| Impersonation | Fake client | mTLS or `private_key_jwt` with pinned JWK; client registration & rotation |
| Key compromise | Signing key leak | HSM/KMS storage, key rotation, audit; emergency key revoke path; narrow token TTL |
| Cross‑tenant abuse | Scope elevation | Enforce `aud`, `tid`, `inst` at issuance and resource servers |
| Downgrade to bearer | Strip DPoP | Resource servers require DPoP/mTLS based on `aud`; reject bearer without `cnf` |

17) Deployment & HA

Stateless microservice, containerized; run ≥ 2 replicas behind LB.
DB: HA Postgres (or MySQL) for clients/roles; Redis for device codes, DPoP nonces/jtis.
Secrets: mount client JWKs via K8s Secrets/HashiCorp Vault; signing keys via KMS.
Backups: DB daily; Redis not critical (ephemeral).
Disaster recovery: export/import of client registry; JWKS rehydrate from KMS.
Compliance: TLS audit; penetration testing for OIDC flows.

18) Implementation notes

Reference stack: .NET 10 + OpenIddict 6 (or IdentityServer if licensed) with custom DPoP validator and mTLS binding middleware.
Keep the DPoP/JTI cache pluggable; allow Redis/Memcached.
Provide client SDKs for C# and Go: DPoP key mgmt, proof generation, nonce handling, token refresh helper.

19) Quick reference — wire examples

Access token (payload excerpt)

{
  "iss": "https://authority.internal",
  "sub": "scanner-web",
  "aud": "signer",
  "exp": 1760668800,
  "iat": 1760668620,
  "nbf": 1760668590,
  "jti": "9d9c3f01-6e1a-49f1-8f77-9b7e6f7e3c50",
  "scope": "signer.sign",
  "tid": "tenant-01",
  "inst": "install-7A2B",
  "cnf": { "jkt": "KcVb2V...base64url..." }
}

DPoP proof claims carried in the DPoP header (for POST /sign/dsse)

{
  "htu": "https://signer.internal/sign/dsse",
  "htm": "POST",
  "iat": 1760668620,
  "jti": "4b1c9b3c-8a95-4c58-8a92-9c6cfb4a6a0b"
}

Signer validates that hash(JWK) in the proof matches cnf.jkt in the token.
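
A sketch of that check, together with the `htm`/`htu`/`iat` checks from section 11, is shown below. It assumes the proof signature has already been verified and its JWK thumbprint computed; the function names and reason strings are hypothetical, the 120 s window mirrors `proofLifetime` in the configuration, and comparing `htu` without query and fragment follows RFC 9449.

```python
from urllib.parse import urlsplit


def normalize_htu(url: str) -> str:
    """Drop query and fragment; lower-case scheme and host before comparing."""
    parts = urlsplit(url)
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{parts.path}"


def check_dpop_proof(proof_claims, proof_jkt, token_claims, method, url,
                     now, max_age_seconds=120):
    """Return None if the proof matches this request and token, else a reason string."""
    if proof_claims.get("htm") != method:
        return "htm_mismatch"
    if normalize_htu(proof_claims.get("htu", "")) != normalize_htu(url):
        return "htu_mismatch"
    iat = proof_claims.get("iat")
    if not isinstance(iat, (int, float)) or abs(now - iat) > max_age_seconds:
        return "stale_iat"
    if not proof_claims.get("jti"):
        return "missing_jti"  # jti uniqueness is enforced separately by the replay cache
    if (token_claims.get("cnf") or {}).get("jkt") != proof_jkt:
        return "jkt_mismatch"
    return None
```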

20) Rollout plan

MVP: Client Credentials (private_key_jwt + DPoP), JWKS, short OpToks, per‑audience scopes.
Add: mTLS‑bound tokens for Signer/Attestor; device code for CLI; optional introspection.
Hardening: DPoP nonce support; full audit pipeline; HA tuning.
UX: Tenant/installation admin UI; role→scope editors; client bootstrap wizards.
