# Authority Rate Limit Tuning Outline (2025-10-11)

## Purpose
- Drive the remaining work on SEC3.B (Security Guild) and PLG6.DOC (Docs Guild) by capturing the agreed baseline for Authority rate limits and related documentation deliverables.
- Provide a single reference for lockout + rate limit interplay so Docs can lift accurate copy into `docs/security/rate-limits.md` and `docs/dev/31_AUTHORITY_PLUGIN_DEVELOPER_GUIDE.md`.

## Baseline Configuration
- `/token`: fixed window, permitLimit 30, window 60s, queueLimit 0. Reduce to 10/60s for untrusted IP ranges; raise to 60/60s only with compensating controls (WAF + active monitoring).
- `/authorize`: permitLimit 60, window 60s, queueLimit 10. Intended for interactive browser flows; lowering permitLimit below 30 requires UX review.
- `/internal/*`: disabled by default; recommended 5/60s with queueLimit 0 when the bootstrap API is exposed.
- Configuration path: `authority.security.rateLimiting.<endpoint>` (e.g., `token.permitLimit`). YAML/ENV bindings follow the standard options hierarchy.
- Retry metadata: the middleware stamps `Retry-After` along with the tags `authority.client_id`, `authority.remote_ip`, and `authority.endpoint`. Docs should highlight these for operator dashboards.

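To make the `/token` baseline concrete, here is a minimal Python sketch of fixed-window semantics with permitLimit 30, a 60s window, and queueLimit 0. It is a toy model under stated assumptions, not the Authority middleware: `FixedWindowLimiter` and `try_acquire` are hypothetical names, windows are assumed to align to time zero, and the returned delay only approximates what a `Retry-After` value might convey.

```python
from dataclasses import dataclass


@dataclass
class FixedWindowLimiter:
    """Toy fixed-window limiter (hypothetical; not the Authority middleware)."""

    permit_limit: int       # permits granted per window
    window_seconds: float   # window length in seconds
    window_start: float = 0.0
    used: int = 0

    def try_acquire(self, now: float) -> tuple[bool, float]:
        """Return (allowed, retry_after_seconds) for a request at time `now`."""
        # Start a fresh window once the current one has fully elapsed.
        if now - self.window_start >= self.window_seconds:
            elapsed_windows = (now - self.window_start) // self.window_seconds
            self.window_start += elapsed_windows * self.window_seconds
            self.used = 0
        if self.used < self.permit_limit:
            self.used += 1
            return True, 0.0
        # queueLimit 0: reject immediately and report the time until the window resets.
        return False, self.window_start + self.window_seconds - now


# Baseline /token values from this outline: 30 permits per 60s window.
limiter = FixedWindowLimiter(permit_limit=30, window_seconds=60)
results = [limiter.try_acquire(now=float(i)) for i in range(31)]  # one request per second
allowed = sum(1 for ok, _ in results if ok)
after_reset = limiter.try_acquire(now=60.0)
```

With one request per second, the first 30 requests are admitted, the 31st is rejected with a 30-second wait, and the first request of the next window is admitted again.
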
## Parameter Matrix
| Scenario | permitLimit | window | queueLimit | Notes |
|----------|-------------|--------|------------|-------|
| Default production | 30 | 60s | 0 | Works with anonymous quota (33 scans/day). |
| High-trust clustered IPs | 60 | 60s | 5 | Requires `authorize_rate_limit_hits` alert ≤ 1% sustained. |
| Air-gapped lab | 10 | 120s | 0 | Emphasise reduced concurrency + manual queue draining. |
| Incident lockdown | 5 | 300s | 0 | Pair with lockout lowering to 3 attempts. |

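As a rough sanity check on the matrix, the sketch below computes the ceiling on admitted requests per hour for each row. It assumes every scenario uses a fixed window applied per partition (the table does not state the algorithm), and `max_requests_per_hour` is a hypothetical helper name. A non-zero queueLimit delays requests rather than adding permits, so it does not raise the ceiling.

```python
# Rows copied from the parameter matrix: (permitLimit, window in seconds, queueLimit).
SCENARIOS = {
    "Default production": (30, 60, 0),
    "High-trust clustered IPs": (60, 60, 5),
    "Air-gapped lab": (10, 120, 0),
    "Incident lockdown": (5, 300, 0),
}


def max_requests_per_hour(permit_limit: int, window_seconds: int) -> int:
    """Upper bound on admitted requests per hour for one partition (fixed window assumed)."""
    windows_per_hour = 3600 // window_seconds
    return permit_limit * windows_per_hour


ceilings = {
    name: max_requests_per_hour(permits, window)
    for name, (permits, window, _queue) in SCENARIOS.items()
}

for name, ceiling in ceilings.items():
    print(f"{name}: at most {ceiling} requests/hour")
```

Under these assumptions the ceilings are 1800, 3600, 300, and 60 requests per hour respectively, which makes the gap between default production and incident lockdown (a factor of 30) easy to communicate to operators.
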
## Lockout Interplay
- Ensure Docs explain the difference between rate limiting (per IP/client) and lockout (per subject), and provide a table mapping `Retry-After` headers to recommended support scripts.
- Security Guild to define alert thresholds: trigger a SOC ticket when the 429 rate exceeds 25% for 5 minutes or when the limiter emits more than 100 events/hour per client.

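The proposed alert thresholds can be sketched as a small predicate. This is illustrative only: the function name `should_open_soc_ticket` and its inputs are hypothetical, and it reads "429 rate > 25% for 5 minutes" as the share of 429 responses across a single 5-minute window, which is one possible interpretation that the Security Guild may refine.

```python
def should_open_soc_ticket(
    total_requests_5m: int,
    rejected_429_5m: int,
    limiter_events_last_hour_by_client: dict[str, int],
) -> bool:
    """Apply the two proposed thresholds; either one alone triggers a ticket."""
    # Threshold 1: more than 25% of requests in the 5-minute window were rejected with 429.
    ratio = rejected_429_5m / total_requests_5m if total_requests_5m else 0.0
    ratio_breached = ratio > 0.25
    # Threshold 2: any single client produced more than 100 limiter events in the last hour.
    noisy_client = any(
        count > 100 for count in limiter_events_last_hour_by_client.values()
    )
    return ratio_breached or noisy_client


quiet = should_open_soc_ticket(1000, 200, {"client-a": 40})
at_boundary = should_open_soc_ticket(1000, 250, {"client-a": 100})
burst = should_open_soc_ticket(1000, 300, {"client-a": 40})
noisy = should_open_soc_ticket(1000, 10, {"client-b": 101})
```

Both comparisons are strict, so exactly 25% rejections or exactly 100 events does not trigger a ticket in this sketch.
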
## Observability
- Surface metrics: `aspnetcore_rate_limiting_rejections_total{limiter="authority-token"}` and custom log tags from `AuthorityRateLimiterMetadataMiddleware`.
- Recommend dashboard sections: request volume vs. rejections, top offending clientIds, per-endpoint heatmap.

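The dashboard sections above amount to simple aggregations over the rejection log tags. The following sketch shows the idea on made-up data; the event records, client IDs, and the helpers `top_offenders` and `rejections_by_endpoint` are hypothetical, and only the tag names `authority.client_id` and `authority.endpoint` come from this outline.

```python
from collections import Counter

# Hypothetical rejection events as (client_id, endpoint) pairs.
RAW_EVENTS = [
    ("cli-a", "/token"),
    ("cli-a", "/token"),
    ("cli-a", "/authorize"),
    ("cli-b", "/token"),
    ("cli-b", "/token"),
    ("cli-c", "/authorize"),
]

# Shape each event like a structured log record carrying the outline's tag names.
events = [
    {"authority.client_id": client_id, "authority.endpoint": endpoint}
    for client_id, endpoint in RAW_EVENTS
]


def top_offenders(records: list[dict[str, str]], limit: int = 3) -> list[tuple[str, int]]:
    """Rank client IDs by number of rejection events, highest first."""
    counts = Counter(record["authority.client_id"] for record in records)
    return counts.most_common(limit)


def rejections_by_endpoint(records: list[dict[str, str]]) -> Counter:
    """Count rejection events per endpoint (one row of a per-endpoint heatmap)."""
    return Counter(record["authority.endpoint"] for record in records)


offenders = top_offenders(events, limit=2)
per_endpoint = rejections_by_endpoint(events)
```

In a real deployment these aggregations would more likely live in the log or metrics backend; the sketch only shows which tags feed which panel.
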
## Action Items
1. Security Guild (SEC3.B): incorporate the matrix and alert rules into `docs/security/rate-limits.md`, add YAML examples for override blocks, and cross-link the lockout policy doc.
2. Docs Guild (PLG6.DOC): update developer guide section 9 with the middleware sequence and reference this outline for retry metadata and tuning guidance.
3. Authority Core: validate that the appsettings sample includes the `security.rateLimiting` block with comments, and link back to the published doc once ready.