# StellaOps Authority Rate Limit Guidance
StellaOps Authority applies fixed-window rate limiting to critical endpoints so that brute-force and burst traffic are throttled before they can exhaust downstream resources. This guide complements the lockout policy documentation and captures the recommended defaults, override scenarios, and monitoring practices for `/token`, `/authorize`, and `/internal/*` routes.
## Configuration Overview
Rate limits live under `security.rateLimiting` in `authority.yaml` (and map to the same hierarchy for environment variables). Each endpoint exposes:
- `enabled` — toggles the limiter.
- `permitLimit` — maximum requests per fixed window.
- `window` — window duration expressed as an `hh:mm:ss` timespan (e.g., `00:01:00` for one minute).
- `queueLimit` — number of requests allowed to queue when the window is exhausted.
```yaml
security:
  rateLimiting:
    token:
      enabled: true
      permitLimit: 30
      window: 00:01:00
      queueLimit: 0
    authorize:
      enabled: true
      permitLimit: 60
      window: 00:01:00
      queueLimit: 10
    internal:
      enabled: false
      permitLimit: 5
      window: 00:01:00
      queueLimit: 0
```
When a limit triggers, the middleware adds a `Retry-After` header to the response and tags the corresponding log entries (`authority.endpoint`, `authority.client_id`, `authority.remote_ip`) so operators can correlate events with clients and source IPs.
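To make the `permitLimit` and `window` semantics concrete, here is a minimal Python sketch of a fixed-window counter. It is an illustration only, not the Authority middleware: the class and method names are hypothetical, queuing is left out (equivalent to `queueLimit: 0`), and the window is assumed to start at the first request seen.

```python
import time
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class FixedWindowLimiter:
    """Illustrative fixed-window counter (hypothetical, no queuing)."""

    permit_limit: int
    window_seconds: float
    _window_start: Optional[float] = field(default=None, init=False)
    _count: int = field(default=0, init=False)

    def try_acquire(self, now: Optional[float] = None) -> Tuple[bool, float]:
        """Return (allowed, retry_after_seconds) for one request."""
        now = time.monotonic() if now is None else now
        # Open a fresh window on first use or once the current one has elapsed.
        if self._window_start is None or now - self._window_start >= self.window_seconds:
            self._window_start = now
            self._count = 0
        if self._count < self.permit_limit:
            self._count += 1
            return True, 0.0
        # Window exhausted: the caller would answer 429 with this Retry-After.
        return False, self._window_start + self.window_seconds - now
```

With `permit_limit=30` and `window_seconds=60`, the 31st request inside one minute is rejected and the returned delay is the time left until the window resets.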
Environment overrides follow the same hierarchy. For example:
```
STELLAOPS_AUTHORITY__SECURITY__RATELIMITING__TOKEN__PERMITLIMIT=60
STELLAOPS_AUTHORITY__SECURITY__RATELIMITING__TOKEN__WINDOW=00:01:00
```
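The naming pattern in the two variables above can be sketched as a small helper. This is inferred from the examples only: the function name is hypothetical, and the rule (prefix, upper-cased path segments, double-underscore separators) is an assumption that happens to reproduce them.

```python
def env_var_name(config_path: str, prefix: str = "STELLAOPS_AUTHORITY") -> str:
    """Map a dotted config path to the environment variable form shown above.

    Assumption: each path segment is upper-cased and joined with "__".
    """
    segments = [segment.upper() for segment in config_path.split(".")]
    return "__".join([prefix] + segments)


print(env_var_name("security.rateLimiting.token.permitLimit"))
print(env_var_name("security.rateLimiting.token.window"))
```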
## Recommended Profiles
| Scenario | permitLimit | window | queueLimit | Notes |
|----------|-------------|--------|------------|-------|
| Default production | 30 | 60s | 0 | Balances anonymous quota (33 scans/day) with headroom for tenant bursts. |
| High-trust clustered IPs | 60 | 60s | 5 | Requires a WAF allowlist and alerting that confirms `aspnetcore_rate_limiting_rejections_total{limiter="authority-token"}` stays at or below 1% sustained. |
| Air-gapped lab | 10 | 120s | 0 | Lower concurrency reduces noise when running from shared bastion hosts. |
| Incident lockdown | 5 | 300s | 0 | Pair with credential lockout limit of 3 attempts and SOC paging for each denial. |
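The table gives windows in seconds, while `authority.yaml` expects `hh:mm:ss` values. As a rough aid when comparing profiles, the sketch below converts each window and derives the average permits per minute. The helper names are hypothetical; the numbers are copied from the table.

```python
# (permitLimit, window in seconds) copied from the profiles table.
PROFILES = {
    "Default production": (30, 60),
    "High-trust clustered IPs": (60, 60),
    "Air-gapped lab": (10, 120),
    "Incident lockdown": (5, 300),
}


def permits_per_minute(permit_limit: int, window_seconds: int) -> float:
    """Average permits per minute allowed by a fixed window."""
    return permit_limit * 60 / window_seconds


def to_timespan(window_seconds: int) -> str:
    """Render whole seconds as the hh:mm:ss form used for `window`."""
    hours, remainder = divmod(window_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


for name, (limit, window) in PROFILES.items():
    rate = permits_per_minute(limit, window)
    print(f"{name}: window={to_timespan(window)}, {rate:g} permits/min")
```

These are averages: a fixed window can still admit up to twice `permitLimit` in a short burst that straddles a window boundary.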
### Lockout Interplay
- Rate limiting throttles by IP/client; lockout policies apply per subject. Keep both enabled.
- During lockdown scenarios, reduce `security.lockout.maxFailures` alongside the rate limits above so that subjects face quicker escalation.
- Map support playbooks to the observed `Retry-After` value: anything above 120 seconds should trigger manual investigation before re-enabling clients.
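The 120-second rule in the last bullet could be encoded in support tooling roughly as follows. This is a sketch: the function name is hypothetical, and it assumes the `Retry-After` header carries a delay in whole seconds rather than an HTTP date.

```python
def needs_manual_investigation(retry_after: str, threshold_seconds: int = 120) -> bool:
    """Return True when the Retry-After delay exceeds the playbook threshold.

    Assumption: the header value is an integer number of seconds.
    """
    delay_seconds = int(retry_after.strip())
    return delay_seconds > threshold_seconds


# A lockdown-profile response (300s window) would be flagged; a default one would not.
print(needs_manual_investigation("300"))
print(needs_manual_investigation("60"))
```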
## Monitoring and Alerts
1. **Metrics**
   - `aspnetcore_rate_limiting_rejections_total{limiter="authority-token"}` for `/token`.
   - `aspnetcore_rate_limiting_rejections_total{limiter="authority-authorize"}` for `/authorize`.
   - Custom counters derived from the structured log tags (`authority.remote_ip`, `authority.client_id`).
2. **Dashboards**
   - Requests vs. rejections per endpoint.
   - Top offending clients/IP ranges in the current window.
   - Heatmap of `Retry-After` durations to spot persistent throttling.
3. **Alerts**
   - Notify SOC when 429 rates exceed 25% for five consecutive minutes on any limiter.
   - Trigger client-specific alerts when a single `client_id` produces >100 throttle events/hour.
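The two alert conditions can be expressed as simple predicates over pre-aggregated data. The sketch below assumes hypothetical inputs (per-minute request and rejection counts, and a list of `client_id` values for one hour of throttle events) and is not tied to any particular alerting backend.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Tuple


def soc_alert(per_minute: Sequence[Tuple[int, int]],
              threshold: float = 0.25, consecutive: int = 5) -> bool:
    """True if the 429 ratio exceeds `threshold` for `consecutive` minutes in a row.

    per_minute holds (total_requests, rejected_requests) pairs, oldest first.
    """
    streak = 0
    for total, rejected in per_minute:
        if total > 0 and rejected / total > threshold:
            streak += 1
            if streak >= consecutive:
                return True
        else:
            streak = 0  # a healthy or empty minute breaks the run
    return False


def noisy_clients(throttled_client_ids: Iterable[str], limit: int = 100) -> List[str]:
    """Return client IDs with more than `limit` throttle events in the hour given."""
    counts = Counter(throttled_client_ids)
    return sorted(client for client, n in counts.items() if n > limit)
```

Both thresholds are strict ("exceed", ">100"), so a limiter sitting at exactly 25% or a client at exactly 100 events does not fire.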
## Operational Checklist
- Validate updated limits in staging before production rollout; smoke-test with a representative workload.
- When raising limits, confirm audit events continue to capture `authority.client_id`, `authority.remote_ip`, and correlation IDs for throttle responses.
- Document any overrides in the change log, including justification and expiry review date.