Authority Rate Limit Tuning Outline (2025-10-11)

Purpose

  • Drive the remaining work on SEC3.B (Security Guild) and PLG6.DOC (Docs Guild) by capturing the agreed baseline for Authority rate limits and related documentation deliverables.
  • Provide a single reference for lockout + rate limit interplay so Docs can lift accurate copy into docs/security/rate-limits.md and docs/dev/31_AUTHORITY_PLUGIN_DEVELOPER_GUIDE.md.

Baseline Configuration

  • /token: fixed window, permitLimit 30, window 60s, queueLimit 0. Reduce to 10/60s for untrusted IP ranges; raise to 60/60s only with compensating controls (WAF + active monitoring).
  • /authorize: permitLimit 60, window 60s, queueLimit 10. Intended for interactive browser flows; lowering below 30 requires UX review.
  • /internal/*: disabled by default; recommended 5/60s with queueLimit 0 when the bootstrap API is exposed.
  • Configuration path: authority.security.rateLimiting.<endpoint> (e.g., token.permitLimit). YAML/ENV bindings follow the standard options hierarchy.
  • Retry metadata: the middleware stamps the Retry-After header along with the tags authority.client_id, authority.remote_ip, and authority.endpoint. Docs should highlight these for operator dashboards.
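
To make the /token baseline concrete, the following is a minimal Python sketch of a fixed-window counter with queueLimit 0, where requests over the limit are rejected immediately rather than queued. It is a simulation for documentation purposes, not the Authority middleware: the class name FixedWindowLimiter, the alignment of windows to multiples of the window length, and the rounding of the retry hint up to whole seconds are all assumptions.

```python
import math
from dataclasses import dataclass, field
from typing import Tuple


@dataclass
class FixedWindowLimiter:
    """Illustrative fixed-window counter (hypothetical, not the Authority code)."""

    permit_limit: int      # permits granted per window (baseline /token: 30)
    window_seconds: float  # window length in seconds (baseline /token: 60)
    _window_index: int = field(default=-1, init=False)
    _used: int = field(default=0, init=False)

    def try_acquire(self, now: float) -> Tuple[bool, int]:
        """Return (allowed, retry_after_seconds) for a request at time `now`."""
        # Assumption: windows are aligned to multiples of window_seconds.
        index = int(now // self.window_seconds)
        if index != self._window_index:
            # A new window has started, so the permit counter resets.
            self._window_index = index
            self._used = 0
        if self._used < self.permit_limit:
            self._used += 1
            return True, 0
        # queueLimit 0: reject immediately and hint when the window rolls over.
        retry_after = math.ceil((index + 1) * self.window_seconds - now)
        return False, retry_after


limiter = FixedWindowLimiter(permit_limit=30, window_seconds=60)
results = [limiter.try_acquire(now=10.0) for _ in range(31)]
print(results[-1])  # (False, 50)
```

The first 30 calls in the window succeed; the 31st is rejected with a 50-second retry hint because the window that started at t=0 ends at t=60.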

Parameter Matrix

| Scenario | permitLimit | window | queueLimit | Notes |
|---|---|---|---|---|
| Default production | 30 | 60s | 0 | Works with anonymous quota (33 scans/day). |
| High-trust clustered IPs | 60 | 60s | 5 | Requires authorize_rate_limit_hits alert ≤ 1% sustained. |
| Air-gapped lab | 10 | 120s | 0 | Emphasise reduced concurrency + manual queue draining. |
| Incident lockdown | 5 | 300s | 0 | Pair with lockout lowering to 3 attempts. |
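
Because the scenarios use different window lengths, their permit limits are not directly comparable. The sketch below normalises each row of the matrix to a sustained requests-per-minute figure. The Scenario type and the sustained_per_minute helper are hypothetical names introduced only for this illustration; the numbers are copied from the matrix.

```python
from typing import NamedTuple


class Scenario(NamedTuple):
    name: str
    permit_limit: int
    window_seconds: int
    queue_limit: int


# Values copied from the parameter matrix above.
MATRIX = [
    Scenario("Default production", 30, 60, 0),
    Scenario("High-trust clustered IPs", 60, 60, 5),
    Scenario("Air-gapped lab", 10, 120, 0),
    Scenario("Incident lockdown", 5, 300, 0),
]


def sustained_per_minute(s: Scenario) -> float:
    """Average permits per minute once the limiter is saturated."""
    return s.permit_limit * 60 / s.window_seconds


for s in MATRIX:
    print(f"{s.name}: {sustained_per_minute(s):g} req/min sustained, "
          f"burst of {s.permit_limit} per window")
```

Incident lockdown therefore allows one request per minute on average, while a fixed window still lets a client spend all five permits at once at the start of a window.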

Lockout Interplay

  • Ensure Docs explain the difference between a rate limit (per IP/client) and a lockout (per subject). Provide a table mapping Retry-After headers to recommended support scripts.
  • Security Guild to define alert thresholds: trigger a SOC ticket when the 429 rate exceeds 25% for 5 minutes or when the limiter emits more than 100 events/hour per client.
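
The proposed alert thresholds can be expressed as a small predicate. This is a sketch under stated assumptions: "for 5 minutes" is read as every one of the last five one-minute buckets exceeding the ratio, and the function name, argument shapes, and constant names are hypothetical rather than part of any existing alerting rule.

```python
from typing import Mapping, Sequence, Tuple

REJECTION_RATIO_THRESHOLD = 0.25  # 429 rate above 25%
SUSTAINED_MINUTES = 5             # ...held for 5 minutes
EVENTS_PER_HOUR_THRESHOLD = 100   # limiter events per client per hour


def should_open_soc_ticket(
    per_minute: Sequence[Tuple[int, int]],
    events_last_hour: Mapping[str, int],
) -> bool:
    """per_minute holds (total_requests, rejected_429) buckets, oldest first."""
    recent = per_minute[-SUSTAINED_MINUTES:]
    # Assumption: the ratio must be exceeded in each of the last five buckets.
    sustained = len(recent) == SUSTAINED_MINUTES and all(
        total > 0 and rejected / total > REJECTION_RATIO_THRESHOLD
        for total, rejected in recent
    )
    # Any single client above the hourly event threshold also triggers.
    noisy_client = any(
        count > EVENTS_PER_HOUR_THRESHOLD for count in events_last_hour.values()
    )
    return sustained or noisy_client


print(should_open_soc_ticket([(100, 30)] * 5, {}))  # True
print(should_open_soc_ticket([(100, 25)] * 5, {}))  # False
```

Both comparisons are strict, so exactly 25% rejections or exactly 100 events per hour does not trigger a ticket; the Security Guild may prefer inclusive bounds.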

Observability

  • Surface metrics: aspnetcore_rate_limiting_rejections_total{limiter="authority-token"} and custom log tags from AuthorityRateLimiterMetadataMiddleware.
  • Recommend dashboard sections: request volume vs. rejections, top offending clientIds, per-endpoint heatmap.
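
As an illustration of the "top offending clientIds" panel, the sketch below counts rejection events by the authority.client_id tag. The event shape (a mapping of tag names to strings) and the sample client names are assumptions made for this example; only the tag names come from this outline.

```python
from collections import Counter
from typing import Iterable, List, Mapping, Tuple


def top_offenders(
    rejections: Iterable[Mapping[str, str]], limit: int = 3
) -> List[Tuple[str, int]]:
    """Rank clients by number of rejected requests, highest first."""
    counts = Counter(
        event.get("authority.client_id", "<unknown>") for event in rejections
    )
    return counts.most_common(limit)


# Hypothetical rejection events carrying the tags named in this outline.
sample = [
    {"authority.client_id": "ci-runner", "authority.endpoint": "/token"},
    {"authority.client_id": "cli", "authority.endpoint": "/token"},
    {"authority.client_id": "ci-runner", "authority.endpoint": "/token"},
    {"authority.client_id": "ci-runner", "authority.endpoint": "/authorize"},
]
print(top_offenders(sample, limit=2))  # [('ci-runner', 3), ('cli', 1)]
```

Grouping the same events by authority.endpoint instead would give the input for the per-endpoint heatmap.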

Action Items

  1. Security Guild (SEC3.B): incorporate matrix + alert rules into docs/security/rate-limits.md, add YAML examples for override blocks, and cross-link lockout policy doc.
  2. Docs Guild (PLG6.DOC): update developer guide section 9 with the middleware sequence and reference this outline for retry metadata + tuning guidance.
  3. Authority Core: validate that the appsettings sample includes the security.rateLimiting block with comments, and link back to the published doc once it is ready.