# Rate Limit Design Contract
**Contract ID:** CONTRACT-RATE-LIMIT-001
**Status:** APPROVED
**Effective Date:** 2025-12-07
**Owners:** Platform Reliability Guild, Gateway Guild
## Overview
This contract defines the rate limiting design for StellaOps API endpoints. Its goals are fair resource allocation, protection against abuse, and a consistent client experience across all services.
## Rate Limiting Strategy
### Tiered Rate Limits
| Tier | Requests/Minute | Requests/Hour | Burst Limit | Typical Use Case |
|------|-----------------|---------------|-------------|------------------|
| **Free** | 60 | 1,000 | 10 | Evaluation, small projects |
| **Standard** | 300 | 10,000 | 50 | Production workloads |
| **Enterprise** | 1,000 | 50,000 | 200 | Large-scale deployments |
| **Unlimited** | No limit | No limit | No limit | Internal services, VIP |
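The per-minute and per-hour limits interact: a client that sustains its full per-minute rate reaches the hourly cap before the hour is over. The sketch below works through that arithmetic for the limited tiers in the table. The `TierLimits` shape and the `minutesUntilHourlyCap` helper are illustrative names, not part of the contract.

```typescript
// Illustrative shape for one row of the tier table (hypothetical name).
interface TierLimits {
  requestsPerMinute: number;
  requestsPerHour: number;
  burstLimit: number;
}

// Values copied from the Tiered Rate Limits table.
const TIER_TABLE: Record<string, TierLimits> = {
  free: { requestsPerMinute: 60, requestsPerHour: 1000, burstLimit: 10 },
  standard: { requestsPerMinute: 300, requestsPerHour: 10000, burstLimit: 50 },
  enterprise: { requestsPerMinute: 1000, requestsPerHour: 50000, burstLimit: 200 },
};

// Minutes a client can sustain the full per-minute rate before the hourly cap is reached.
function minutesUntilHourlyCap(tier: TierLimits): number {
  return tier.requestsPerHour / tier.requestsPerMinute;
}

for (const [name, tier] of Object.entries(TIER_TABLE)) {
  console.log(`${name}: hourly cap reached after ${minutesUntilHourlyCap(tier).toFixed(1)} min at full rate`);
}
```

At full rate the hourly cap is reached after roughly 16.7 minutes on Free, 33.3 minutes on Standard, and 50 minutes on Enterprise, so the hourly limit, not the per-minute limit, bounds sustained throughput.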
### Per-Endpoint Rate Limits
Some endpoints have additional rate limits based on resource intensity:
| Endpoint Category | Rate Limit | Rationale |
|-------------------|------------|-----------|
| `/api/risk/simulation/*` | 30/min | CPU-intensive simulation |
| `/api/risk/simulation/studio/*` | 10/min | Full breakdown analysis |
| `/system/airgap/seal` | 5/hour | Critical state change |
| `/policy/decisions` | 100/min | Lightweight evaluation |
| `/api/policy/packs/*/bundle` | 10/min | Bundle compilation |
| Export endpoints | 20/min | I/O-intensive operations |
## Implementation
### Algorithm
Use the **Token Bucket** algorithm with the following configuration:
```yaml
rate_limit:
  algorithm: token_bucket
  bucket_size: ${BURST_LIMIT}                # maximum burst, taken from the tier's Burst Limit
  refill_rate: ${REQUESTS_PER_MINUTE} / 60   # tokens added per second
  refill_interval: 1s
```
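To make the configuration concrete, here is a minimal in-memory token bucket sketch. It assumes the bucket starts full and that tokens are credited once per whole `refill_interval`; the class and method names are illustrative, and a real gateway would typically keep this state in a shared store rather than in process memory.

```typescript
// Minimal token bucket sketch; class and method names are illustrative.
class TokenBucket {
  private readonly bucketSize: number;        // bucket_size: maximum burst
  private readonly tokensPerInterval: number; // refill_rate scaled to one refill_interval
  private readonly intervalMs: number;        // refill_interval in milliseconds
  private tokens: number;
  private lastRefillMs: number;

  constructor(bucketSize: number, requestsPerMinute: number, nowMs: number, intervalMs: number = 1000) {
    this.bucketSize = bucketSize;
    this.intervalMs = intervalMs;
    this.tokensPerInterval = (requestsPerMinute / 60) * (intervalMs / 1000);
    this.tokens = bucketSize; // assumption: start full so an initial burst is allowed
    this.lastRefillMs = nowMs;
  }

  // Credits one batch of tokens per whole interval elapsed, capped at bucketSize.
  private refill(nowMs: number): void {
    const intervals = Math.floor((nowMs - this.lastRefillMs) / this.intervalMs);
    if (intervals <= 0) return;
    this.tokens = Math.min(this.bucketSize, this.tokens + intervals * this.tokensPerInterval);
    this.lastRefillMs += intervals * this.intervalMs;
  }

  // Spends one token and returns true if the request is allowed.
  tryConsume(nowMs: number): boolean {
    this.refill(nowMs);
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

For the Standard tier this would be constructed as `new TokenBucket(50, 300, Date.now())`: a burst of 50 requests is allowed immediately, after which 5 tokens become available each second.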
### Rate Limit Headers
All responses include the rate limit headers below; `Retry-After` applies only when the limit has been exceeded:
```http
X-RateLimit-Limit: 300
X-RateLimit-Remaining: 295
X-RateLimit-Reset: 1701936000
X-RateLimit-Policy: standard
Retry-After: 30
```
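A client can read these headers into a typed structure. The sketch below assumes `X-RateLimit-Reset` is a Unix timestamp in seconds (which the sample value suggests but the contract does not state) and that header names are matched case-insensitively; the `RateLimitStatus` type and `parseRateLimitHeaders` function are hypothetical names.

```typescript
// Hypothetical client-side view of the rate limit headers.
interface RateLimitStatus {
  limit: number;
  remaining: number;
  resetAt: Date;                    // assumption: header carries Unix epoch seconds
  policy: string;
  retryAfterSeconds: number | null; // null when the header is absent
}

function parseRateLimitHeaders(raw: Record<string, string>): RateLimitStatus {
  // HTTP header names are case-insensitive, so normalize before lookup.
  const h: Record<string, string | undefined> = {};
  for (const [name, value] of Object.entries(raw)) {
    h[name.toLowerCase()] = value;
  }
  const retryAfter = h["retry-after"];
  return {
    limit: Number(h["x-ratelimit-limit"]),
    remaining: Number(h["x-ratelimit-remaining"]),
    resetAt: new Date(Number(h["x-ratelimit-reset"]) * 1000),
    policy: h["x-ratelimit-policy"] ?? "",
    retryAfterSeconds: retryAfter === undefined ? null : Number(retryAfter),
  };
}

// Usage with the sample header values shown above.
const sampleStatus = parseRateLimitHeaders({
  "X-RateLimit-Limit": "300",
  "X-RateLimit-Remaining": "295",
  "X-RateLimit-Reset": "1701936000",
  "X-RateLimit-Policy": "standard",
});
console.log(sampleStatus.remaining, sampleStatus.resetAt.toISOString());
```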
### Rate Limit Response
When the rate limit is exceeded, return:
```http
HTTP/1.1 429 Too Many Requests
Content-Type: application/problem+json
Retry-After: 30
```
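The contract fixes the status code, content type, and `Retry-After` header but does not show a response body. The sketch below fills one in using only the standard problem-details members (`type`, `title`, `status`, `detail`) from RFC 9457. The `detail` wording and the helper names are assumptions, and any StellaOps-specific extension members are deliberately left out.

```typescript
// Hypothetical, framework-neutral representation of the 429 response.
interface RateLimitedResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

function buildRateLimitedResponse(retryAfterSeconds: number): RateLimitedResponse {
  // "about:blank" is the RFC 9457 default type; the title then mirrors the HTTP status phrase.
  const problem = {
    type: "about:blank",
    title: "Too Many Requests",
    status: 429,
    detail: `Rate limit exceeded. Retry after ${retryAfterSeconds} seconds.`, // assumed wording
  };
  return {
    status: 429,
    headers: {
      "Content-Type": "application/problem+json",
      "Retry-After": String(retryAfterSeconds),
    },
    body: JSON.stringify(problem),
  };
}

console.log(buildRateLimitedResponse(30).body);
```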
## Rate Limit Keys
### Primary Key: Tenant ID + Client ID
```
rate_limit_key = "${tenant_id}:${client_id}"
```
### Fallback Keys
1. Authenticated: `tenant:${tenant_id}:user:${user_id}`
2. API Key: `apikey:${api_key_hash}`
3. Anonymous: `ip:${client_ip}`
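The key selection can be sketched as a single function. It assumes the fallbacks are tried in the listed order whenever the primary tenant and client pair is unavailable; the `RequestIdentity` shape and its field names are hypothetical.

```typescript
// Hypothetical view of what the gateway knows about the caller.
interface RequestIdentity {
  tenantId?: string;
  clientId?: string;
  userId?: string;
  apiKeyHash?: string;
  clientIp: string;
}

function resolveRateLimitKey(id: RequestIdentity): string {
  // Primary key: tenant ID + client ID.
  if (id.tenantId && id.clientId) return `${id.tenantId}:${id.clientId}`;
  // Fallback 1: authenticated user within a tenant.
  if (id.tenantId && id.userId) return `tenant:${id.tenantId}:user:${id.userId}`;
  // Fallback 2: API key, identified by its hash.
  if (id.apiKeyHash) return `apikey:${id.apiKeyHash}`;
  // Fallback 3: anonymous traffic, keyed by client IP.
  return `ip:${id.clientIp}`;
}

console.log(resolveRateLimitKey({ tenantId: "acme", clientId: "cli", clientIp: "203.0.113.7" }));
```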
## Exemptions
### Exempt Endpoints
The following endpoints are exempt from rate limiting:
- `GET /health`
- `GET /ready`
- `GET /metrics`
- `GET /.well-known/*`
### Exempt Clients
- Internal service mesh traffic (mTLS authenticated)
- Localhost connections in development mode
- Clients with `unlimited` tier
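The exemption rules above can be expressed as two small predicates. This is a sketch under stated assumptions: a trailing `/*` is treated as a path-prefix wildcard, and the `ClientContext` fields are hypothetical names for facts the gateway would already have established.

```typescript
// Exempt endpoints, copied from the list above.
const EXEMPT_ENDPOINTS: string[] = [
  "GET /health",
  "GET /ready",
  "GET /metrics",
  "GET /.well-known/*",
];

function isExemptEndpoint(method: string, path: string): boolean {
  const candidate = `${method.toUpperCase()} ${path}`;
  return EXEMPT_ENDPOINTS.some((rule) =>
    rule.endsWith("/*")
      ? candidate.startsWith(rule.slice(0, -1)) // keep the trailing "/" so "/health" does not match "/healthz"
      : candidate === rule,
  );
}

// Hypothetical per-request client facts.
interface ClientContext {
  internalMeshMtls: boolean; // internal service mesh traffic, mTLS authenticated
  fromLocalhost: boolean;
  developmentMode: boolean;
  tier: string;
}

function isExemptClient(client: ClientContext): boolean {
  return (
    client.internalMeshMtls ||
    (client.fromLocalhost && client.developmentMode) ||
    client.tier === "unlimited"
  );
}
```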
## Quota Management
### Tenant Quota Tracking
```yaml
quota:
  tracking:
    storage: redis
    key_prefix: "stellaops:quota:"
    ttl: 3600  # 1 hour rolling window
  dimensions:
    - tenant_id
    - endpoint_category
    - time_bucket
```
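One way the three dimensions could combine into a storage key is sketched below. The colon separator and the derivation of `time_bucket` as the hour number since the Unix epoch are assumptions; the contract specifies only the prefix, the TTL, and the dimension names. Fixed hourly buckets like this approximate, rather than exactly implement, a rolling window.

```typescript
const QUOTA_KEY_PREFIX = "stellaops:quota:"; // key_prefix from the configuration
const QUOTA_WINDOW_SECONDS = 3600;           // ttl from the configuration

// Builds a key from tenant_id, endpoint_category, and an hourly time_bucket (assumed layout).
function quotaKey(tenantId: string, endpointCategory: string, nowMs: number): string {
  const timeBucket = Math.floor(nowMs / 1000 / QUOTA_WINDOW_SECONDS);
  return `${QUOTA_KEY_PREFIX}${tenantId}:${endpointCategory}:${timeBucket}`;
}

console.log(quotaKey("acme", "export", Date.now()));
```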
### Quota Alerts
| Threshold | Action |
|-----------|--------|
| 80% consumed | Emit `quota.warning` event |
| 95% consumed | Emit `quota.critical` event |
| 100% consumed | Block requests, emit `quota.exceeded` event |
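The threshold table maps directly onto a small classification function. The sketch assumes each threshold is inclusive (for example, exactly 80% consumed triggers the warning) and returns the level for the current usage; deciding when to emit an event only once per crossing is left to the caller. The function and type names are illustrative.

```typescript
type QuotaEvent = "quota.warning" | "quota.critical" | "quota.exceeded";

interface QuotaDecision {
  event: QuotaEvent | null; // null below the warning threshold
  blocked: boolean;
}

function evaluateQuota(used: number, quota: number): QuotaDecision {
  // Integer comparisons avoid floating-point surprises at the exact thresholds.
  if (used >= quota) return { event: "quota.exceeded", blocked: true };
  if (used * 100 >= quota * 95) return { event: "quota.critical", blocked: false };
  if (used * 100 >= quota * 80) return { event: "quota.warning", blocked: false };
  return { event: null, blocked: false };
}

console.log(evaluateQuota(8000, 10000)); // warning level for the Standard hourly quota
```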
## Configuration
### Gateway Configuration
```yaml
# gateway/rate-limits.yaml
rateLimiting:
  enabled: true
  defaultTier: standard
  tiers:
    free:
      requestsPerMinute: 60
      requestsPerHour: 1000
      burstLimit: 10
    standard:
      requestsPerMinute: 300
      requestsPerHour: 10000
      burstLimit: 50
    enterprise:
      requestsPerMinute: 1000
      requestsPerHour: 50000
      burstLimit: 200
  endpoints:
    - pattern: "/api/risk/simulation/*"
      limit: 30
      window: 60s
    - pattern: "/api/risk/simulation/studio/*"
      limit: 10
      window: 60s
    - pattern: "/system/airgap/seal"
      limit: 5
      window: 3600s
```
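The first two endpoint patterns overlap: a request to `/api/risk/simulation/studio/...` matches both. The contract does not say how the gateway resolves this, so the sketch below assumes the most specific (longest) matching pattern wins and that a trailing `/*` is a path-prefix wildcard. All names are illustrative.

```typescript
interface EndpointLimit {
  pattern: string;
  limit: number;
  windowSeconds: number;
}

// Entries copied from the endpoints section of the gateway configuration.
const ENDPOINT_LIMITS: EndpointLimit[] = [
  { pattern: "/api/risk/simulation/*", limit: 30, windowSeconds: 60 },
  { pattern: "/api/risk/simulation/studio/*", limit: 10, windowSeconds: 60 },
  { pattern: "/system/airgap/seal", limit: 5, windowSeconds: 3600 },
];

function matchesPattern(pattern: string, path: string): boolean {
  if (pattern.endsWith("/*")) return path.startsWith(pattern.slice(0, -1));
  return path === pattern;
}

// Returns the longest matching pattern, or undefined when only tier limits apply.
function findEndpointLimit(path: string): EndpointLimit | undefined {
  let best: EndpointLimit | undefined;
  for (const candidate of ENDPOINT_LIMITS) {
    if (!matchesPattern(candidate.pattern, path)) continue;
    if (best === undefined || candidate.pattern.length > best.pattern.length) {
      best = candidate;
    }
  }
  return best;
}

console.log(findEndpointLimit("/api/risk/simulation/studio/run"));
```

With this rule, studio requests get the 10/min limit rather than the broader 30/min simulation limit, regardless of the order in which the entries are listed.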
### Policy Engine Configuration
```csharp
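// PolicyEngineRateLimitOptions.cs
using System;
using Microsoft.AspNetCore.RateLimiting;

public static class PolicyEngineRateLimitOptions
{
    public const string PolicyName = "PolicyEngineRateLimit";

    // Registers a token bucket policy: bursts of up to 50 requests,
    // refilled with 5 tokens every 10 seconds (30 requests/minute sustained).
    public static void Configure(RateLimiterOptions options)
    {
        options.AddTokenBucketLimiter(PolicyName, opt =>
        {
            opt.TokenLimit = 50;            // bucket capacity
            opt.QueueLimit = 10;            // requests allowed to wait for a token
            opt.ReplenishmentPeriod = TimeSpan.FromSeconds(10);
            opt.TokensPerPeriod = 5;
            opt.AutoReplenishment = true;   // refill on the limiter's internal timer
        });
    }
}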
```
## Monitoring
### Metrics
| Metric | Type | Labels |
|--------|------|--------|
| `stellaops_rate_limit_requests_total` | Counter | tier, endpoint, status |
| `stellaops_rate_limit_exceeded_total` | Counter | tier, endpoint |
| `stellaops_rate_limit_remaining` | Gauge | tenant_id, tier |
| `stellaops_rate_limit_queue_size` | Gauge | endpoint |
### Alerts
```yaml
# prometheus/rules/rate-limiting.yaml
groups:
  - name: rate_limiting
    rules:
      - alert: HighRateLimitExceeded
        expr: rate(stellaops_rate_limit_exceeded_total[5m]) > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High rate of rate limit exceeded events"
```
## Integration with Web UI
### Client SDK Configuration
```typescript
// stellaops-sdk/rate-limit-handler.ts
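interface RateLimitConfig {
  retryOnRateLimit: boolean;  // retry automatically after a 429 response
  maxRetries: number;         // stop retrying after this many attempts
  backoffMultiplier: number;  // factor applied to the delay after each retry
  maxBackoffSeconds: number;  // upper bound on the backoff delay
}

const defaultConfig: RateLimitConfig = {
  retryOnRateLimit: true,
  maxRetries: 3,
  backoffMultiplier: 2,
  maxBackoffSeconds: 60
};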
```
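How these settings might translate into retry delays is sketched below. The contract does not define the initial delay or how `Retry-After` interacts with the backoff cap, so the sketch assumes a 1-second initial delay and lets a server-supplied `Retry-After` take precedence when it is longer than the computed backoff. Jitter is omitted to keep the arithmetic easy to follow. `nextRetryDelaySeconds` and `retryDefaults` are hypothetical names.

```typescript
// Same shape as RateLimitConfig above, repeated so the sketch is self-contained.
interface RateLimitConfig {
  retryOnRateLimit: boolean;
  maxRetries: number;
  backoffMultiplier: number;
  maxBackoffSeconds: number;
}

const retryDefaults: RateLimitConfig = {
  retryOnRateLimit: true,
  maxRetries: 3,
  backoffMultiplier: 2,
  maxBackoffSeconds: 60,
};

const INITIAL_DELAY_SECONDS = 1; // assumption: not specified by the contract

// attempt is 1-based; returns null when the request should not be retried.
function nextRetryDelaySeconds(
  attempt: number,
  config: RateLimitConfig,
  retryAfterSeconds: number = 0,
): number | null {
  if (!config.retryOnRateLimit || attempt > config.maxRetries) return null;
  const exponential = INITIAL_DELAY_SECONDS * Math.pow(config.backoffMultiplier, attempt - 1);
  const capped = Math.min(exponential, config.maxBackoffSeconds);
  // Retrying sooner than the server asked would only produce another 429.
  return Math.max(capped, retryAfterSeconds);
}

for (let attempt = 1; attempt <= 4; attempt++) {
  console.log(attempt, nextRetryDelaySeconds(attempt, retryDefaults));
}
```

With the default settings and no `Retry-After` header, the delays are 1, 2, and 4 seconds, and the fourth attempt is not retried.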
### UI Rate Limit Display
The Web UI displays rate limit status in the console header, showing:
- Current remaining requests
- Time until reset
- Visual indicator when approaching limit (< 20% remaining)
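The three display values can be derived from the rate limit headers. The sketch below assumes the reset value is a Unix timestamp in seconds and applies the "less than 20% remaining" rule strictly, so exactly 20% does not trigger the indicator. The function and type names are illustrative.

```typescript
interface RateLimitDisplay {
  remaining: number;
  secondsUntilReset: number;
  approachingLimit: boolean; // true when fewer than 20% of requests remain
}

function describeRateLimitStatus(
  remaining: number,
  limit: number,
  resetEpochSeconds: number, // assumption: Unix epoch seconds
  nowMs: number,
): RateLimitDisplay {
  const secondsUntilReset = Math.max(0, Math.ceil(resetEpochSeconds - nowMs / 1000));
  return {
    remaining,
    secondsUntilReset,
    approachingLimit: remaining * 100 < limit * 20, // integer form of remaining / limit < 0.2
  };
}

console.log(describeRateLimitStatus(59, 300, 1701936000, 1701935970000));
```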
## Changelog
| Date | Version | Change |
|------|---------|--------|
| 2025-12-07 | 1.0.0 | Initial contract definition |
## References
- [API Governance Baseline](./api-governance-baseline.md)
- [Web Gateway Architecture](../modules/gateway/architecture.md)
- [Policy Engine Rate Limiting](../modules/policy/design/rate-limiting.md)