StellaOps Authority Rate Limit Guidance

StellaOps Authority applies fixed-window rate limiting to critical endpoints so that brute-force and burst traffic are throttled before they can exhaust downstream resources. This guide complements the lockout policy documentation and captures the recommended defaults, override scenarios, and monitoring practices for /token, /authorize, and /internal/* routes.

Configuration Overview

Rate limits are configured under security.rateLimiting in authority.yaml; environment variable overrides follow the same hierarchy. Each endpoint exposes the following settings:

enabled — toggles the limiter.
permitLimit — maximum requests per fixed window.
window — window duration expressed as an hh:mm:ss timespan (e.g., 00:01:00 for one minute).
queueLimit — number of requests allowed to queue when the window is exhausted.

security:
  rateLimiting:
    token:
      enabled: true
      permitLimit: 30
      window: 00:01:00
      queueLimit: 0
    authorize:
      enabled: true
      permitLimit: 60
      window: 00:01:00
      queueLimit: 10
    internal:
      enabled: false
      permitLimit: 5
      window: 00:01:00
      queueLimit: 0
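
To make the interaction between permitLimit, window, and queueLimit concrete, the following is a minimal Python sketch of a fixed-window counter. It is not the Authority implementation: the class name FixedWindowLimiter and its try_acquire method are hypothetical, windows are anchored to the first request after expiry, and queueing is reduced to a counter whose entries are served first when the next window opens.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FixedWindowLimiter:
    """Illustrative fixed-window limiter (hypothetical; not the Authority code)."""

    permit_limit: int        # mirrors permitLimit
    window_seconds: float    # mirrors window, converted to seconds
    queue_limit: int = 0     # mirrors queueLimit
    _window_start: float = field(default=0.0, init=False)
    _used: int = field(default=0, init=False)
    _queued: int = field(default=0, init=False)

    def try_acquire(self, now: float) -> tuple[str, float]:
        """Return (decision, retry_after_seconds) for a request arriving at `now`."""
        if now - self._window_start >= self.window_seconds:
            # A new window opens; queued requests consume its permits first.
            self._window_start = now
            self._used = min(self._queued, self.permit_limit)
            self._queued -= self._used

        if self._used < self.permit_limit:
            self._used += 1
            return ("allowed", 0.0)

        # Window exhausted: the caller must wait until the window rolls over.
        retry_after = self._window_start + self.window_seconds - now
        if self._queued < self.queue_limit:
            self._queued += 1
            return ("queued", retry_after)
        return ("rejected", retry_after)


# Values taken from the token section of the sample configuration above.
token_limiter = FixedWindowLimiter(permit_limit=30, window_seconds=60.0, queue_limit=0)
outcomes = [token_limiter.try_acquire(now=1.0)[0] for _ in range(31)]
print(outcomes.count("allowed"), outcomes.count("rejected"))
```

With queueLimit set to 0, as in the token section, the request after the last permit is rejected immediately rather than held. A fixed window can also admit close to twice permitLimit in a short span that straddles a window boundary, which is worth keeping in mind when sizing limits.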

When a limit triggers, the middleware adds a Retry-After header to the throttled response and tags the corresponding log entry with authority.endpoint, authority.client_id, and authority.remote_ip so operators can correlate events with clients and source IPs.

Environment overrides follow the same hierarchy. For example:

STELLAOPS_AUTHORITY__SECURITY__RATELIMITING__TOKEN__PERMITLIMIT=60
STELLAOPS_AUTHORITY__SECURITY__RATELIMITING__TOKEN__WINDOW=00:01:00
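
As a rough illustration of how such variable names line up with the YAML hierarchy, the Python sketch below splits a name on the double underscore and writes the value into a nested dictionary. The helper apply_overrides is hypothetical, the case-insensitive key matching is an assumption inferred from the upper-case example names, and values are left as strings rather than converted to their target types.

```python
from __future__ import annotations

import copy

PREFIX = "STELLAOPS_AUTHORITY__"  # prefix taken from the example names above


def apply_overrides(config: dict, environ: dict[str, str]) -> dict:
    """Return a copy of `config` with prefixed environment overrides applied."""
    result = copy.deepcopy(config)
    for name, value in environ.items():
        if not name.startswith(PREFIX):
            continue
        segments = name[len(PREFIX):].split("__")
        node = result
        for index, segment in enumerate(segments):
            # Reuse an existing key's casing when it matches case-insensitively
            # (assumption: keys such as rateLimiting are matched without regard to case).
            key = next((k for k in node if k.lower() == segment.lower()), segment.lower())
            if index == len(segments) - 1:
                node[key] = value
            else:
                node = node.setdefault(key, {})
    return result


base = {"security": {"rateLimiting": {"token": {"permitLimit": 30, "window": "00:01:00"}}}}
overrides = {"STELLAOPS_AUTHORITY__SECURITY__RATELIMITING__TOKEN__PERMITLIMIT": "60"}
print(apply_overrides(base, overrides)["security"]["rateLimiting"]["token"])
```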

Recommended Profiles

Scenario	permitLimit	window	queueLimit	Notes
Default production	30	60s	0	Balances anonymous quota (33 scans/day) with headroom for tenant bursts.
High-trust clustered IPs	60	60s	5	Requires a WAF allowlist and an alert confirming that `aspnetcore_rate_limiting_rejections_total{limiter="authority-token"}` stays at or below 1% of requests on a sustained basis.
Air-gapped lab	10	120s	0	Lower concurrency reduces noise when running from shared bastion hosts.
Incident lockdown	5	300s	0	Pair with credential lockout limit of 3 attempts and SOC paging for each denial.
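
The table lists windows in seconds while authority.yaml expects hh:mm:ss values, so a small conversion helps when applying a profile. The Python sketch below formats each window and works out the average admitted rate when every window is fully used. The helper names are hypothetical, and the rate is per limiter partition, assuming partitions follow the IP/client split described in this guide.

```python
# Profiles from the table above: (permitLimit, window in seconds, queueLimit).
PROFILES = {
    "Default production": (30, 60, 0),
    "High-trust clustered IPs": (60, 60, 5),
    "Air-gapped lab": (10, 120, 0),
    "Incident lockdown": (5, 300, 0),
}


def to_timespan(seconds: int) -> str:
    """Format a window in seconds as hh:mm:ss, the form used in authority.yaml."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def sustained_per_minute(permit_limit: int, window_seconds: int) -> float:
    """Average admitted requests per minute when every window is fully used."""
    return permit_limit * 60 / window_seconds


for name, (permit_limit, window_seconds, queue_limit) in PROFILES.items():
    rate = sustained_per_minute(permit_limit, window_seconds)
    print(f"{name}: window={to_timespan(window_seconds)}, {rate:g} requests/minute sustained")
```

Read this way, the incident lockdown profile admits on average one request per minute, compared with 30 per minute for the default production profile.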

Lockout Interplay

Rate limiting throttles by IP/client; lockout policies apply per subject. Keep both enabled.
During lockdown scenarios, reduce security.lockout.maxFailures alongside the rate limits above so that subjects face quicker escalation.
Map support playbooks to the observed Retry-After value: anything above 120 seconds should trigger manual investigation before re-enabling clients.
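
The Retry-After triage rule can be sketched in Python as follows. HTTP allows the header to carry either a number of seconds or an HTTP-date; which form Authority emits is not stated here, so the sketch accepts both. The function names are hypothetical, and the 120-second threshold is the one given above.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

MANUAL_REVIEW_THRESHOLD_SECONDS = 120  # threshold from the guidance above


def retry_after_seconds(header_value: str, now: datetime) -> float:
    """Convert a Retry-After header value into seconds to wait from `now`."""
    value = header_value.strip()
    if value.isdigit():
        # Delay expressed directly in seconds.
        return float(value)
    # Otherwise treat the value as an HTTP-date such as
    # "Wed, 21 Oct 2015 07:28:00 GMT"; `now` must be timezone-aware.
    when = parsedate_to_datetime(value)
    return max(0.0, (when - now).total_seconds())


def needs_manual_investigation(header_value: str, now: datetime) -> bool:
    """True when the observed delay exceeds the manual-review threshold."""
    return retry_after_seconds(header_value, now) > MANUAL_REVIEW_THRESHOLD_SECONDS


now = datetime.now(timezone.utc)
for observed in ("59", "120", "300"):
    print(observed, needs_manual_investigation(observed, now))
```

A value of exactly 120 seconds does not trigger review in this sketch, because the guidance applies to anything above 120 seconds.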

Monitoring and Alerts

Metrics
- aspnetcore_rate_limiting_rejections_total{limiter="authority-token"} for /token.
- aspnetcore_rate_limiting_rejections_total{limiter="authority-authorize"} for /authorize.
- Custom counters derived from the structured log tags (authority.remote_ip, authority.client_id).
Dashboards
- Requests vs. rejections per endpoint.
- Top offending clients/IP ranges in the current window.
- Heatmap of retry-after durations to spot persistent throttling.
Alerts
- Notify SOC when 429 rates exceed 25% for five consecutive minutes on any limiter.
- Trigger client-specific alerts when a single client_id produces >100 throttle events/hour.
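
The two alert conditions above can be prototyped offline before they are encoded in a monitoring system. The Python sketch below is one way to express them, assuming per-minute (total, rejected) request counts for a limiter and an hour of throttle log records that carry the authority.client_id tag. The function names and input shapes are hypothetical.

```python
from collections import Counter


def soc_alert(per_minute, threshold=0.25, consecutive=5):
    """True when the rejection ratio exceeds `threshold` for `consecutive` minutes in a row.

    per_minute: list of (total_requests, rejected_requests) tuples, oldest first.
    """
    streak = 0
    for total, rejected in per_minute:
        ratio = rejected / total if total else 0.0
        streak = streak + 1 if ratio > threshold else 0
        if streak >= consecutive:
            return True
    return False


def noisy_clients(throttle_events, limit=100):
    """Return client IDs with more than `limit` throttle events in the supplied hour.

    throttle_events: iterable of dicts holding the authority.client_id log tag.
    """
    counts = Counter(event["authority.client_id"] for event in throttle_events)
    return sorted(client for client, count in counts.items() if count > limit)


# Sample data invented for illustration only.
minutes = [(200, 60)] * 5  # 30% rejected for five minutes in a row
events = [{"authority.client_id": "ci-bot"}] * 101 + [{"authority.client_id": "cli"}] * 100
print(soc_alert(minutes), noisy_clients(events))
```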

Operational Checklist

Validate updated limits in staging before production rollout; smoke-test with representative workload.
When raising limits, confirm audit events continue to capture authority.client_id, authority.remote_ip, and correlation IDs for throttle responses.
Document any overrides in the change log, including justification and expiry review date.
