Doctor plugin checks: implement health check classes and documentation

Implement remediation-aware health checks across all Doctor plugin modules
(Agent, Attestor, Auth, BinaryAnalysis, Compliance, Crypto, Environment,
EvidenceLocker, Notify, Observability, Operations, Policy, Postgres, Release,
Scanner, Storage, Vex) and their backing library counterparts (AI, Attestation,
Authority, Core, Cryptography, Database, Docker, Integration, Notify,
Observability, Security, ServiceGraph, Sources, Verification).

Each check now emits structured remediation metadata (severity, category,
runbook links, and fix suggestions) consumed by the Doctor dashboard
remediation panel.

Also adds:
- docs/doctor/articles/ knowledge base for check explanations
- Advisory AI search seed and allowlist updates for doctor content
- Sprint plan for doctor checks documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 12:28:00 +02:00
parent fbd24e71de
commit c58a236d70
326 changed files with 18500 additions and 463 deletions

@@ -0,0 +1,99 @@
---
checkId: check.security.ratelimit
plugin: stellaops.doctor.security
severity: warn
tags: [security, ratelimit, api]
---
# Rate Limiting
## What It Checks
Validates that rate limiting is configured to prevent API abuse. The check inspects `RateLimiting:*` and `Security:RateLimiting:*` configuration sections:
| Condition | Result | Message |
|---|---|---|
| `Enabled` not set at all | `info` | rate limiting configuration not found |
| `Enabled` is `false` | `warn` | rate limiting explicitly disabled |
| `PermitLimit` > 10,000 | `warn` | permit count very high |
| `WindowSeconds` < 1 | `warn` | window too short |
| `WindowSeconds` > 3,600 | `warn` | window too long for burst prevention |
| Effective rate > 1,000 req/s | `warn` | rate may be too permissive |
The effective rate is calculated as `PermitLimit / WindowSeconds`.
Default values if not explicitly set: `PermitLimit` = 100, `WindowSeconds` = 60, `QueueLimit` = 0.
Evidence collected includes: enabled state, permit limit, window seconds, queue limit, and effective requests per second.
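The rules in the table can be sketched in a few lines of Python. This is an illustrative approximation, not the plugin's actual implementation: the function name `evaluate_rate_limit` and the dictionary-based config are assumptions, and the sketch reports every matching condition, whereas the real check may stop at the first one.

```python
# Hypothetical sketch of the documented rules; names are illustrative only.
DEFAULTS = {"PermitLimit": 100, "WindowSeconds": 60, "QueueLimit": 0}


def evaluate_rate_limit(config: dict) -> list:
    """Return (severity, message) findings for a rate limiting config section."""
    enabled = config.get("Enabled")
    if enabled is None:
        return [("info", "rate limiting configuration not found")]
    if enabled is False:
        return [("warn", "rate limiting explicitly disabled")]

    # Fall back to the documented defaults when a value is not set.
    permit_limit = config.get("PermitLimit", DEFAULTS["PermitLimit"])
    window = config.get("WindowSeconds", DEFAULTS["WindowSeconds"])

    findings = []
    if permit_limit > 10_000:
        findings.append(("warn", "permit count very high"))
    if window < 1:
        findings.append(("warn", "window too short"))
    elif window > 3_600:
        findings.append(("warn", "window too long for burst prevention"))
    # Effective rate = PermitLimit / WindowSeconds; skip it for a non-positive
    # window to avoid dividing by zero (an assumption of this sketch).
    if window > 0 and permit_limit / window > 1_000:
        findings.append(("warn", "rate may be too permissive"))
    return findings
```

With the defaults, the effective rate is 100 / 60, roughly 1.67 requests per second, so `evaluate_rate_limit({"Enabled": True})` returns no findings.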
## Why It Matters
Without rate limiting, the API is vulnerable to denial-of-service attacks, credential-stuffing, and resource exhaustion. A single client or compromised API key can overwhelm the service, affecting all users. Rate limiting is especially important for:
- Login endpoints (prevents brute-force attacks)
- Scan submission endpoints (prevents resource exhaustion)
- Evidence upload endpoints (prevents storage exhaustion)
## Common Causes
- Rate limiting explicitly disabled in configuration
- Rate limiting configuration section not present
- Permit limit set too high (greater than 10,000 per window)
- Rate limit window too short (less than 1 second) or too long (greater than 1 hour)
- Effective rate too permissive (more than 1,000 requests per second)
## How to Fix
### Docker Compose
Set rate limiting configuration:
```yaml
environment:
  RateLimiting__Enabled: "true"
  RateLimiting__PermitLimit: "100"
  RateLimiting__WindowSeconds: "60"
  RateLimiting__QueueLimit: "10"
```
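The double underscores in these variable names correspond to the `RateLimiting:*` keys the check inspects. Assuming the service reads its settings through the standard .NET configuration environment-variable provider (which the `appsettings.json` instructions suggest, but which is not stated here), `__` is translated to the `:` section separator. The helper below is a hypothetical illustration of that mapping, not part of the product:

```python
def env_to_config_key(name: str) -> str:
    """Map an environment variable name to its hierarchical config key.

    Mirrors the .NET convention where "__" stands in for the ":" separator.
    """
    return name.replace("__", ":")


for var in ("RateLimiting__Enabled", "RateLimiting__PermitLimit"):
    print(f"{var} -> {env_to_config_key(var)}")
```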
### Bare Metal / systemd
Edit `appsettings.json`:
```json
{
  "RateLimiting": {
    "Enabled": true,
    "PermitLimit": 100,
    "WindowSeconds": 60,
    "QueueLimit": 10
  }
}
```
### Kubernetes / Helm
Set in Helm values:
```yaml
rateLimiting:
  enabled: true
  permitLimit: 100
  windowSeconds: 60
  queueLimit: 10
```
For stricter per-endpoint limits, configure additional policies:
```yaml
rateLimiting:
  policies:
    login:
      permitLimit: 10
      windowSeconds: 300
    scan:
      permitLimit: 20
      windowSeconds: 60
```
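To see how strict these per-endpoint policies are, the same `PermitLimit / WindowSeconds` formula can be applied to each one. The snippet below is a small worked calculation over the values from the example above; whether the Doctor check evaluates per-endpoint policies against its thresholds is not stated here, so treat this only as arithmetic.

```python
# Policy values copied from the Helm example above.
policies = {
    "login": {"permitLimit": 10, "windowSeconds": 300},
    "scan": {"permitLimit": 20, "windowSeconds": 60},
}


def effective_rate(policy: dict) -> float:
    """Effective requests per second for one policy."""
    return policy["permitLimit"] / policy["windowSeconds"]


for name, policy in policies.items():
    rate = effective_rate(policy)
    print(f"{name}: {rate:.3f} req/s ({rate * 60:.0f} per minute)")
# login: 0.033 req/s (2 per minute)
# scan: 0.333 req/s (20 per minute)
```

Both rates sit far below the 1,000 req/s threshold, which is the point of a stricter policy for sensitive endpoints.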
## Verification
```bash
stella doctor run --check check.security.ratelimit
```
## Related Checks
- `check.security.apikey` — per-key rate limiting for API key authentication
- `check.security.password.policy` — lockout policy provides complementary brute-force protection