Rate Limit Design Contract
Contract ID: CONTRACT-RATE-LIMIT-001
Status: APPROVED
Effective Date: 2025-12-07
Owners: Platform Reliability Guild, Gateway Guild
Overview
This contract defines the rate limiting design for StellaOps API endpoints. Its goals are fair resource allocation, protection against abuse, and a consistent client experience across all services.
Rate Limiting Strategy
Tiered Rate Limits
| Tier | Requests/Minute | Requests/Hour | Burst Limit | Typical Use Case |
|---|---|---|---|---|
| Free | 60 | 1,000 | 10 | Evaluation, small projects |
| Standard | 300 | 10,000 | 50 | Production workloads |
| Enterprise | 1,000 | 50,000 | 200 | Large-scale deployments |
| Unlimited | No limit | No limit | No limit | Internal services, VIP |
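The per-minute and per-hour columns interact: a client that sustains the full per-minute rate reaches the hourly cap well before the hour ends. The sketch below works through that arithmetic for the three limited tiers. It assumes both limits are enforced together; `TierLimits`, `tiers`, and `minutesUntilHourlyCap` are illustrative names, not part of this contract.

```typescript
// Illustrative only: tier values copied from the table above.
interface TierLimits {
  requestsPerMinute: number;
  requestsPerHour: number;
  burstLimit: number;
}

type LimitedTier = "free" | "standard" | "enterprise";

const tiers: Record<LimitedTier, TierLimits> = {
  free: { requestsPerMinute: 60, requestsPerHour: 1000, burstLimit: 10 },
  standard: { requestsPerMinute: 300, requestsPerHour: 10000, burstLimit: 50 },
  enterprise: { requestsPerMinute: 1000, requestsPerHour: 50000, burstLimit: 200 },
};

// Minutes of traffic at the full per-minute rate before the hourly cap is hit.
function minutesUntilHourlyCap(tier: TierLimits): number {
  return tier.requestsPerHour / tier.requestsPerMinute;
}

const tierNames: LimitedTier[] = ["free", "standard", "enterprise"];
for (const name of tierNames) {
  console.log(`${name}: ${minutesUntilHourlyCap(tiers[name]).toFixed(1)} min`);
}
```

At the full per-minute rate, the Free tier reaches its hourly cap after about 16.7 minutes, Standard after about 33.3 minutes, and Enterprise after 50 minutes.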
Per-Endpoint Rate Limits
Some endpoints have additional rate limits based on resource intensity:
| Endpoint Category | Rate Limit | Rationale |
|---|---|---|
| `/api/risk/simulation/*` | 30/min | CPU-intensive simulation |
| `/api/risk/simulation/studio/*` | 10/min | Full breakdown analysis |
| `/system/airgap/seal` | 5/hour | Critical state change |
| `/policy/decisions` | 100/min | Lightweight evaluation |
| `/api/policy/packs/*/bundle` | 10/min | Bundle compilation |
| Export endpoints | 20/min | I/O-intensive operations |
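Because `/api/risk/simulation/studio/*` also matches `/api/risk/simulation/*`, a gateway needs a rule for overlapping patterns. The sketch below is one possible resolution, assuming the longest (most specific) matching pattern wins; this contract does not state that rule, and `EndpointLimit`, `patternToRegExp`, and `resolveEndpointLimit` are hypothetical names. Export endpoints are omitted because the table gives no path for them.

```typescript
// Hypothetical helper: resolve the per-endpoint limit for a request path.
interface EndpointLimit {
  pattern: string;       // glob-style pattern; "*" matches any run of characters
  limit: number;         // maximum requests per window
  windowSeconds: number; // window length in seconds
}

// Values copied from the table above.
const endpointLimits: EndpointLimit[] = [
  { pattern: "/api/risk/simulation/*", limit: 30, windowSeconds: 60 },
  { pattern: "/api/risk/simulation/studio/*", limit: 10, windowSeconds: 60 },
  { pattern: "/system/airgap/seal", limit: 5, windowSeconds: 3600 },
  { pattern: "/policy/decisions", limit: 100, windowSeconds: 60 },
  { pattern: "/api/policy/packs/*/bundle", limit: 10, windowSeconds: 60 },
];

// Convert a glob pattern into an anchored regular expression.
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`);
}

// Assumption: when several patterns match, the longest pattern wins.
function resolveEndpointLimit(path: string): EndpointLimit | undefined {
  return endpointLimits
    .filter((entry) => patternToRegExp(entry.pattern).test(path))
    .sort((a, b) => b.pattern.length - a.pattern.length)[0];
}
```

With this rule, a request to a path under `/api/risk/simulation/studio/` resolves to the 10/min limit rather than the broader 30/min limit.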
Implementation
Algorithm
Use the token bucket algorithm with the following configuration:
rate_limit:
  algorithm: token_bucket
  bucket_size: ${BURST_LIMIT}
  refill_rate: ${REQUESTS_PER_MINUTE} / 60
  refill_interval: 1s
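To illustrate how these settings behave, here is a minimal in-memory token bucket sketch. It assumes the bucket starts full and that tokens are added only at whole refill intervals; the `TokenBucket` class and its method names are hypothetical, and a real gateway would also need shared state across instances.

```typescript
// Minimal sketch of a token bucket; not the gateway's actual implementation.
class TokenBucket {
  private readonly bucketSize: number;        // corresponds to bucket_size
  private readonly refillPerInterval: number; // corresponds to refill_rate
  private readonly intervalMs: number;        // corresponds to refill_interval
  private tokens: number;
  private lastRefillMs: number;

  constructor(bucketSize: number, refillPerInterval: number, nowMs: number, intervalMs = 1000) {
    this.bucketSize = bucketSize;
    this.refillPerInterval = refillPerInterval;
    this.intervalMs = intervalMs;
    this.tokens = bucketSize; // assumption: the bucket starts full
    this.lastRefillMs = nowMs;
  }

  // Returns true and spends one token if the request is allowed.
  tryConsume(nowMs: number): boolean {
    const intervals = Math.floor((nowMs - this.lastRefillMs) / this.intervalMs);
    if (intervals > 0) {
      // Add tokens for every whole interval elapsed, capped at the bucket size.
      this.tokens = Math.min(this.bucketSize, this.tokens + intervals * this.refillPerInterval);
      this.lastRefillMs += intervals * this.intervalMs;
    }
    if (this.tokens < 1) {
      return false;
    }
    this.tokens -= 1;
    return true;
  }
}

// Standard tier: burst of 50, refilled at 300 / 60 = 5 tokens per second.
const standardBucket = new TokenBucket(50, 300 / 60, 0);
```

With Standard tier values, the bucket allows an initial burst of 50 requests and then sustains 5 requests per second, which matches 300 requests per minute.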
Rate Limit Headers
All responses include standard rate limit headers:
X-RateLimit-Limit: 300
X-RateLimit-Remaining: 295
X-RateLimit-Reset: 1701936000
X-RateLimit-Policy: standard
Retry-After: 30
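On the client side, these headers can be read into a typed structure. The sketch below assumes headers arrive as a plain name-to-value object and that `Retry-After` carries a number of seconds, as in the example above; `RateLimitInfo` and `parseRateLimitHeaders` are illustrative names.

```typescript
// Illustrative client-side parser for the headers listed above.
interface RateLimitInfo {
  limit: number | null;
  remaining: number | null;
  resetEpochSeconds: number | null;
  policy: string | null;
  retryAfterSeconds: number | null;
}

function parseRateLimitHeaders(headers: Record<string, string>): RateLimitInfo {
  // HTTP header names are case-insensitive, so normalize before lookup.
  const normalized: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    normalized[name.toLowerCase()] = headers[name] ?? "";
  }

  // Read an integer header; return null when it is absent or malformed.
  const readInt = (name: string): number | null => {
    const raw = normalized[name];
    if (raw === undefined) {
      return null;
    }
    const value = parseInt(raw, 10);
    return isNaN(value) ? null : value;
  };

  const policy = normalized["x-ratelimit-policy"];
  return {
    limit: readInt("x-ratelimit-limit"),
    remaining: readInt("x-ratelimit-remaining"),
    resetEpochSeconds: readInt("x-ratelimit-reset"),
    policy: policy === undefined ? null : policy,
    retryAfterSeconds: readInt("retry-after"),
  };
}
```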
Rate Limit Response
When the rate limit is exceeded, return:
HTTP/1.1 429 Too Many Requests
Content-Type: application/problem+json
Retry-After: 30

{
  "type": "https://stellaops.org/problems/rate-limit-exceeded",
  "title": "Rate Limit Exceeded",
  "status": 429,
  "detail": "You have exceeded your rate limit of 300 requests per minute.",
  "instance": "/api/risk/simulation",
  "limit": 300,
  "remaining": 0,
  "reset": 1701936000,
  "retryAfter": 30
}
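A server-side helper for assembling this body might look like the following sketch. It assumes `retryAfter` is the number of seconds remaining until `reset`, which is consistent with the example but not stated by this contract; `RateLimitProblem` and `buildRateLimitProblem` are hypothetical names.

```typescript
// Shape of the problem+json body shown above.
interface RateLimitProblem {
  type: string;
  title: string;
  status: number;
  detail: string;
  instance: string;
  limit: number;
  remaining: number;
  reset: number;
  retryAfter: number;
}

function buildRateLimitProblem(
  instance: string,
  limitPerMinute: number,
  resetEpochSeconds: number,
  nowEpochSeconds: number,
): RateLimitProblem {
  return {
    type: "https://stellaops.org/problems/rate-limit-exceeded",
    title: "Rate Limit Exceeded",
    status: 429,
    detail: `You have exceeded your rate limit of ${limitPerMinute} requests per minute.`,
    instance,
    limit: limitPerMinute,
    remaining: 0,
    reset: resetEpochSeconds,
    // Assumption: retryAfter is the number of seconds left until the reset time.
    retryAfter: Math.max(0, resetEpochSeconds - nowEpochSeconds),
  };
}
```

The same `retryAfter` value would normally also be sent in the `Retry-After` response header so that clients can back off without parsing the body.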
Rate Limit Keys
Primary Key: Tenant ID + Client ID
rate_limit_key = "${tenant_id}:${client_id}"
Fallback Keys
- Authenticated: `tenant:${tenant_id}:user:${user_id}`
- API Key: `apikey:${api_key_hash}`
- Anonymous: `ip:${client_ip}`
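Key selection can be sketched as a small function. The order of precedence used here (primary key, then authenticated user, then API key, then IP address) follows the listing order above and is an assumption; `RequestIdentity` and `rateLimitKey` are illustrative names.

```typescript
// Hypothetical view of the identity information available for a request.
interface RequestIdentity {
  tenantId?: string;
  clientId?: string;
  userId?: string;
  apiKeyHash?: string;
  clientIp: string;
}

// Assumption: keys are tried in the order they are listed in this contract.
function rateLimitKey(id: RequestIdentity): string {
  if (id.tenantId && id.clientId) {
    return `${id.tenantId}:${id.clientId}`; // primary key
  }
  if (id.tenantId && id.userId) {
    return `tenant:${id.tenantId}:user:${id.userId}`;
  }
  if (id.apiKeyHash) {
    return `apikey:${id.apiKeyHash}`;
  }
  return `ip:${id.clientIp}`; // anonymous traffic
}
```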
Exemptions
Exempt Endpoints
The following endpoints are exempt from rate limiting:
- `GET /health`
- `GET /ready`
- `GET /metrics`
- `GET /.well-known/*`
Exempt Clients
- Internal service mesh traffic (mTLS authenticated)
- Localhost connections in development mode
- Clients with `unlimited` tier
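The exemption rules above can be combined into a single check, sketched below. The `RequestContext` fields are hypothetical, and treating `127.0.0.1` and `::1` as the localhost addresses is an assumption of this sketch.

```typescript
// Hypothetical request context; field names are illustrative.
interface RequestContext {
  method: string;
  path: string;
  mtlsAuthenticated: boolean; // internal service mesh traffic
  remoteAddress: string;
  developmentMode: boolean;
  tier: string;
}

const exemptPaths = ["/health", "/ready", "/metrics"];
const loopbackAddresses = ["127.0.0.1", "::1"]; // assumption: these count as localhost

function isExempt(ctx: RequestContext): boolean {
  // Exempt endpoints: the listed GET routes plus anything under /.well-known/.
  const exemptEndpoint =
    ctx.method === "GET" &&
    (exemptPaths.indexOf(ctx.path) !== -1 || ctx.path.indexOf("/.well-known/") === 0);

  // Exempt clients: mesh traffic, localhost in development mode, unlimited tier.
  const exemptClient =
    ctx.mtlsAuthenticated ||
    (ctx.developmentMode && loopbackAddresses.indexOf(ctx.remoteAddress) !== -1) ||
    ctx.tier === "unlimited";

  return exemptEndpoint || exemptClient;
}
```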
Quota Management
Tenant Quota Tracking
quota:
  tracking:
    storage: redis # Valkey (Redis-compatible)
    key_prefix: "stellaops:quota:"
    ttl: 3600 # 1 hour rolling window
  dimensions:
    - tenant_id
    - endpoint_category
    - time_bucket
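One way to turn the prefix and dimensions into a storage key is sketched below. The key layout (prefix, then tenant, endpoint category, and time bucket joined by colons) and the use of fixed one-hour buckets derived from the `ttl` are assumptions for illustration; this contract does not define the exact key format, and `quotaKey` is a hypothetical name.

```typescript
const KEY_PREFIX = "stellaops:quota:"; // key_prefix from the configuration above
const WINDOW_SECONDS = 3600;           // ttl from the configuration above

// Assumed layout: <prefix><tenant_id>:<endpoint_category>:<time_bucket>
function quotaKey(tenantId: string, endpointCategory: string, epochSeconds: number): string {
  // Assumption: time_bucket is the index of the fixed one-hour bucket since the Unix epoch.
  const timeBucket = Math.floor(epochSeconds / WINDOW_SECONDS);
  return `${KEY_PREFIX}${tenantId}:${endpointCategory}:${timeBucket}`;
}
```

Fixed buckets only approximate a rolling window: usage resets at each bucket boundary, so a stricter rolling window would need finer buckets or a sliding-log approach.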
Quota Alerts
| Threshold | Action |
|---|---|
| 80% consumed | Emit quota.warning event |
| 95% consumed | Emit quota.critical event |
| 100% consumed | Block requests, emit quota.exceeded event |
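The threshold table maps directly onto a small decision function. The sketch below is illustrative; `QuotaDecision` and `evaluateQuota` are hypothetical names, and it uses integer arithmetic to avoid floating-point edge cases at the exact thresholds.

```typescript
type QuotaEvent = "quota.warning" | "quota.critical" | "quota.exceeded";

interface QuotaDecision {
  event: QuotaEvent | null; // event to emit, if any
  block: boolean;           // whether further requests should be blocked
}

function evaluateQuota(used: number, limit: number): QuotaDecision {
  // Compare used/limit against each threshold without dividing.
  if (used >= limit) {
    return { event: "quota.exceeded", block: true };
  }
  if (used * 100 >= limit * 95) {
    return { event: "quota.critical", block: false };
  }
  if (used * 100 >= limit * 80) {
    return { event: "quota.warning", block: false };
  }
  return { event: null, block: false };
}
```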
Configuration
Gateway Configuration
# gateway/rate-limits.yaml
rateLimiting:
  enabled: true
  defaultTier: standard
  tiers:
    free:
      requestsPerMinute: 60
      requestsPerHour: 1000
      burstLimit: 10
    standard:
      requestsPerMinute: 300
      requestsPerHour: 10000
      burstLimit: 50
    enterprise:
      requestsPerMinute: 1000
      requestsPerHour: 50000
      burstLimit: 200
  endpoints:
    - pattern: "/api/risk/simulation/*"
      limit: 30
      window: 60s
    - pattern: "/api/risk/simulation/studio/*"
      limit: 10
      window: 60s
    - pattern: "/system/airgap/seal"
      limit: 5
      window: 3600s
Policy Engine Configuration
// PolicyEngineRateLimitOptions.cs
using System;
using Microsoft.AspNetCore.RateLimiting;

public static class PolicyEngineRateLimitOptions
{
    public const string PolicyName = "PolicyEngineRateLimit";

    public static void Configure(RateLimiterOptions options)
    {
        options.AddTokenBucketLimiter(PolicyName, opt =>
        {
            opt.TokenLimit = 50;          // burst capacity
            opt.QueueLimit = 10;          // up to 10 requests may wait for tokens
            opt.ReplenishmentPeriod = TimeSpan.FromSeconds(10);
            opt.TokensPerPeriod = 5;      // 5 tokens per 10 s = 30 requests/min sustained
            opt.AutoReplenishment = true;
        });
    }
}
Monitoring
Metrics
| Metric | Type | Labels |
|---|---|---|
| `stellaops_rate_limit_requests_total` | Counter | tier, endpoint, status |
| `stellaops_rate_limit_exceeded_total` | Counter | tier, endpoint |
| `stellaops_rate_limit_remaining` | Gauge | tenant_id, tier |
| `stellaops_rate_limit_queue_size` | Gauge | endpoint |
Alerts
# prometheus/rules/rate-limiting.yaml
groups:
  - name: rate_limiting
    rules:
      - alert: HighRateLimitExceeded
        expr: rate(stellaops_rate_limit_exceeded_total[5m]) > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High rate of rate limit exceeded events"
Integration with Web UI
Client SDK Configuration
// stellaops-sdk/rate-limit-handler.ts
interface RateLimitConfig {
  retryOnRateLimit: boolean;
  maxRetries: number;
  backoffMultiplier: number;
  maxBackoffSeconds: number;
}

const defaultConfig: RateLimitConfig = {
  retryOnRateLimit: true,
  maxRetries: 3,
  backoffMultiplier: 2,
  maxBackoffSeconds: 60
};
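How these settings translate into wait times is not specified here, so the following is only a sketch of one plausible policy. It assumes a one-second base delay multiplied by `backoffMultiplier` on each retry, a server-provided `Retry-After` value acting as a lower bound, and `maxBackoffSeconds` capping the result. `RetryConfig` mirrors the fields of `RateLimitConfig`, and `retryDelaySeconds` is a hypothetical helper.

```typescript
// Mirrors the RateLimitConfig fields so the sketch is self-contained.
interface RetryConfig {
  retryOnRateLimit: boolean;
  maxRetries: number;
  backoffMultiplier: number;
  maxBackoffSeconds: number;
}

// Returns the delay before retry number `attempt` (1-based), or null to give up.
function retryDelaySeconds(
  attempt: number,
  config: RetryConfig,
  retryAfterSeconds?: number,
): number | null {
  if (!config.retryOnRateLimit || attempt > config.maxRetries) {
    return null;
  }
  // Assumption: the base delay is 1 second and grows by backoffMultiplier per attempt.
  const exponential = Math.pow(config.backoffMultiplier, attempt - 1);
  // Assumption: wait at least as long as the server asked, but never beyond the cap.
  const requested = retryAfterSeconds === undefined ? 0 : retryAfterSeconds;
  return Math.min(config.maxBackoffSeconds, Math.max(exponential, requested));
}

const sketchConfig: RetryConfig = {
  retryOnRateLimit: true,
  maxRetries: 3,
  backoffMultiplier: 2,
  maxBackoffSeconds: 60,
};
```

With the default values and no `Retry-After` header, this policy waits 1, 2, and 4 seconds before the three retries and then gives up.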
UI Rate Limit Display
The Web UI displays rate limit status in the console header with:
- Current remaining requests
- Time until reset
- Visual indicator when approaching limit (< 20% remaining)
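The three display values can be derived from the rate limit headers, as in the sketch below. `RateLimitDisplay` and `rateLimitDisplay` are illustrative names, and the inputs are assumed to come from `X-RateLimit-Limit`, `X-RateLimit-Remaining`, and `X-RateLimit-Reset`.

```typescript
interface RateLimitDisplay {
  remaining: number;         // current remaining requests
  secondsUntilReset: number; // time until reset
  approachingLimit: boolean; // true when less than 20% of the limit remains
}

function rateLimitDisplay(
  limit: number,
  remaining: number,
  resetEpochSeconds: number,
  nowEpochSeconds: number,
): RateLimitDisplay {
  return {
    remaining,
    secondsUntilReset: Math.max(0, resetEpochSeconds - nowEpochSeconds),
    // Integer comparison equivalent to remaining / limit < 0.20.
    approachingLimit: remaining * 100 < limit * 20,
  };
}
```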
Changelog
| Date | Version | Change |
|---|---|---|
| 2025-12-07 | 1.0.0 | Initial contract definition |