# Rate Limit Design Contract

**Contract ID:** CONTRACT-RATE-LIMIT-001
**Status:** APPROVED
**Effective Date:** 2025-12-07
**Owners:** Platform Reliability Guild, Gateway Guild

## Overview

This contract defines the rate limiting design for StellaOps API endpoints, ensuring fair resource allocation, protection against abuse, and a consistent client experience across all services.

## Rate Limiting Strategy

### Tiered Rate Limits

| Tier | Requests/Minute | Requests/Hour | Burst Limit | Typical Use Case |
|------|-----------------|---------------|-------------|------------------|
| **Free** | 60 | 1,000 | 10 | Evaluation, small projects |
| **Standard** | 300 | 10,000 | 50 | Production workloads |
| **Enterprise** | 1,000 | 50,000 | 200 | Large-scale deployments |
| **Unlimited** | No limit | No limit | No limit | Internal services, VIP |

### Per-Endpoint Rate Limits

Some endpoints have additional rate limits based on resource intensity:

| Endpoint Category | Rate Limit | Rationale |
|-------------------|------------|-----------|
| `/api/risk/simulation/*` | 30/min | CPU-intensive simulation |
| `/api/risk/simulation/studio/*` | 10/min | Full breakdown analysis |
| `/system/airgap/seal` | 5/hour | Critical state change |
| `/policy/decisions` | 100/min | Lightweight evaluation |
| `/api/policy/packs/*/bundle` | 10/min | Bundle compilation |
| Export endpoints | 20/min | I/O-intensive operations |

## Implementation

### Algorithm

Use the **Token Bucket** algorithm with the following configuration:

```yaml
rate_limit:
  algorithm: token_bucket
  bucket_size: ${BURST_LIMIT}
  refill_rate: ${REQUESTS_PER_MINUTE} / 60  # tokens per second
  refill_interval: 1s
```

### Rate Limit Headers

All responses include standard rate limit headers:

```http
X-RateLimit-Limit: 300
X-RateLimit-Remaining: 295
X-RateLimit-Reset: 1701936000
X-RateLimit-Policy: standard
Retry-After: 30
```

### Rate Limit Response

When a rate limit is exceeded, return:

```http
HTTP/1.1 429 Too Many Requests
Content-Type: application/problem+json
Retry-After: 30

{
  "type": "https://stellaops.org/problems/rate-limit-exceeded",
  "title": "Rate Limit Exceeded",
  "status": 429,
  "detail": "You have exceeded your rate limit of 300 requests per minute.",
  "instance": "/api/risk/simulation",
  "limit": 300,
  "remaining": 0,
  "reset": 1701936000,
  "retryAfter": 30
}
```

## Rate Limit Keys

### Primary Key: Tenant ID + Client ID

```
rate_limit_key = "${tenant_id}:${client_id}"
```

### Fallback Keys

1. Authenticated: `tenant:${tenant_id}:user:${user_id}`
2. API Key: `apikey:${api_key_hash}`
3. Anonymous: `ip:${client_ip}`

## Exemptions

### Exempt Endpoints

The following endpoints are exempt from rate limiting:

- `GET /health`
- `GET /ready`
- `GET /metrics`
- `GET /.well-known/*`

### Exempt Clients

- Internal service mesh traffic (mTLS authenticated)
- Localhost connections in development mode
- Clients with the `unlimited` tier

## Quota Management

### Tenant Quota Tracking

```yaml
quota:
  tracking:
    storage: redis
    key_prefix: "stellaops:quota:"
    ttl: 3600  # 1 hour rolling window
  dimensions:
    - tenant_id
    - endpoint_category
    - time_bucket
```

### Quota Alerts

| Threshold | Action |
|-----------|--------|
| 80% consumed | Emit `quota.warning` event |
| 95% consumed | Emit `quota.critical` event |
| 100% consumed | Block requests, emit `quota.exceeded` event |

## Configuration

### Gateway Configuration

```yaml
# gateway/rate-limits.yaml
rateLimiting:
  enabled: true
  defaultTier: standard
  tiers:
    free:
      requestsPerMinute: 60
      requestsPerHour: 1000
      burstLimit: 10
    standard:
      requestsPerMinute: 300
      requestsPerHour: 10000
      burstLimit: 50
    enterprise:
      requestsPerMinute: 1000
      requestsPerHour: 50000
      burstLimit: 200
  endpoints:
    - pattern: "/api/risk/simulation/*"
      limit: 30
      window: 60s
    - pattern: "/api/risk/simulation/studio/*"
      limit: 10
      window: 60s
    - pattern: "/system/airgap/seal"
      limit: 5
      window: 3600s
```

### Policy Engine Configuration

```csharp
// PolicyEngineRateLimitOptions.cs
using System;
using Microsoft.AspNetCore.RateLimiting;

public static class PolicyEngineRateLimitOptions
{
    public const string PolicyName = "PolicyEngineRateLimit";

    public static void Configure(RateLimiterOptions options)
    {
        options.AddTokenBucketLimiter(PolicyName, opt =>
        {
            opt.TokenLimit = 50;                                // maximum burst
            opt.QueueLimit = 10;                                // requests allowed to wait for a token
            opt.ReplenishmentPeriod = TimeSpan.FromSeconds(10);
            opt.TokensPerPeriod = 5;                            // 5 tokens every 10 seconds
            opt.AutoReplenishment = true;
        });
    }
}
```

## Monitoring

### Metrics

| Metric | Type | Labels |
|--------|------|--------|
| `stellaops_rate_limit_requests_total` | Counter | tier, endpoint, status |
| `stellaops_rate_limit_exceeded_total` | Counter | tier, endpoint |
| `stellaops_rate_limit_remaining` | Gauge | tenant_id, tier |
| `stellaops_rate_limit_queue_size` | Gauge | endpoint |

### Alerts

```yaml
# prometheus/rules/rate-limiting.yaml
groups:
  - name: rate_limiting
    rules:
      - alert: HighRateLimitExceeded
        expr: rate(stellaops_rate_limit_exceeded_total[5m]) > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High rate of rate limit exceeded events"
```

## Integration with Web UI

### Client SDK Configuration

```typescript
// stellaops-sdk/rate-limit-handler.ts
interface RateLimitConfig {
  retryOnRateLimit: boolean;
  maxRetries: number;
  backoffMultiplier: number;
  maxBackoffSeconds: number;
}

const defaultConfig: RateLimitConfig = {
  retryOnRateLimit: true,
  maxRetries: 3,
  backoffMultiplier: 2,
  maxBackoffSeconds: 60
};
```

### UI Rate Limit Display

The Web UI displays rate limit status in the console header with:

- Current remaining requests
- Time until reset
- Visual indicator when approaching the limit (< 20% remaining)

## Changelog

| Date | Version | Change |
|------|---------|--------|
| 2025-12-07 | 1.0.0 | Initial contract definition |

## References

- [API Governance Baseline](./api-governance-baseline.md)
- [Web Gateway Architecture](../modules/gateway/architecture.md)
- [Policy Engine Rate Limiting](../modules/policy/design/rate-limiting.md)
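
## Appendix: Token Bucket Sketch (Non-Normative)

The token bucket configuration under Implementation (`bucket_size`, `refill_rate`, `refill_interval`) can be illustrated with a small in-memory sketch. It is not the gateway's implementation. The class name `TokenBucket`, the method `tryConsume`, and the explicit `nowMs` clock parameter are hypothetical, chosen only to make the refill arithmetic easy to follow and test. The sketch assumes the bucket starts full and is refilled in whole `refill_interval` steps.

```typescript
// Hypothetical illustration of the token bucket parameters in this contract.
class TokenBucket {
  private readonly bucketSize: number;        // bucket_size (burst limit)
  private readonly tokensPerInterval: number; // refill_rate * refill_interval
  private readonly refillIntervalMs: number;  // refill_interval
  private tokens: number;
  private lastRefillMs: number;

  constructor(bucketSize: number, tokensPerInterval: number, refillIntervalMs: number, nowMs: number) {
    this.bucketSize = bucketSize;
    this.tokensPerInterval = tokensPerInterval;
    this.refillIntervalMs = refillIntervalMs;
    this.tokens = bucketSize; // assumption: a new bucket starts full
    this.lastRefillMs = nowMs;
  }

  // Adds tokens for every whole refill interval elapsed, capped at bucketSize.
  private refill(nowMs: number): void {
    const intervals = Math.floor((nowMs - this.lastRefillMs) / this.refillIntervalMs);
    if (intervals > 0) {
      this.tokens = Math.min(this.bucketSize, this.tokens + intervals * this.tokensPerInterval);
      this.lastRefillMs += intervals * this.refillIntervalMs;
    }
  }

  // Returns true and consumes one token if a request at nowMs is allowed.
  tryConsume(nowMs: number): boolean {
    this.refill(nowMs);
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Standard tier: burst limit 50, 300 requests/minute => 5 tokens per 1s interval.
const standardBucket = new TokenBucket(50, 300 / 60, 1000, 0);
let allowedInBurst = 0;
for (let i = 0; i < 60; i++) {
  if (standardBucket.tryConsume(0)) allowedInBurst++;
}
console.log(allowedInBurst); // 50: the burst limit caps an instantaneous spike
```

Passing the clock in explicitly keeps the sketch deterministic. A production limiter would more likely read a monotonic clock and keep bucket state in shared storage keyed by the rate limit key.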