# Router chaos testing

## Purpose

- Validate backpressure, recovery, and cache failure behavior for the router.

## Test categories

- Load testing with spike scenarios (baseline, 10x, 50x, recovery).
- Backpressure verification for 429 and 503 with Retry-After.
- Recovery tests to ensure queues drain quickly.
- Valkey failure injection with graceful fallback.

## Expected behavior

- Normal load returns 200 OK.
- High load returns 429 with Retry-After.
- Critical load returns 503 with Retry-After.
- Recovery within 30 seconds, zero data loss.

## Metrics

- `http_requests_total{status}`
- `router_request_queue_depth`
- `request_recovery_seconds`

## Alert cues

- Throttle rate above 10% for 5 minutes.
- P95 recovery time above 30 seconds.
- Missing Retry-After headers.

## CI integration

- Runs on PRs touching router code and nightly staging runs.
- Stores results as artifacts for audits.

## Related references

- operations/router-rate-limiting.md
- docs/operations/router-chaos-testing-runbook.md
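
## Example: backpressure check sketch

The backpressure expectations above (429 and 503 responses carrying Retry-After, and a throttle-rate alert cue) could be checked against captured responses roughly as follows. This is a minimal sketch, not the suite's actual implementation: the `Response` record, `missing_retry_after`, and `throttle_rate` are hypothetical names introduced for illustration, and counting both 429 and 503 toward the throttle rate is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Response:
    """Hypothetical stand-in for an HTTP response captured during a load run."""
    status: int
    headers: Dict[str, str] = field(default_factory=dict)


# Assumption: both 429 and 503 count as throttled responses.
THROTTLE_STATUSES = {429, 503}


def missing_retry_after(responses: List[Response]) -> List[Response]:
    """Return throttled responses (429 or 503) that lack a Retry-After header."""
    offenders = []
    for resp in responses:
        if resp.status not in THROTTLE_STATUSES:
            continue
        # HTTP header names are case-insensitive, so normalise before lookup.
        names = {name.lower() for name in resp.headers}
        if "retry-after" not in names:
            offenders.append(resp)
    return offenders


def throttle_rate(responses: List[Response]) -> float:
    """Fraction of responses that were throttled; 0.0 for an empty sample."""
    if not responses:
        return 0.0
    throttled = sum(1 for r in responses if r.status in THROTTLE_STATUSES)
    return throttled / len(responses)


# Illustrative sample: one compliant 429 and one 503 missing the header.
sample = [
    Response(200),
    Response(429, {"Retry-After": "2"}),
    Response(503),
    Response(200),
]
offenders = missing_retry_after(sample)
rate = throttle_rate(sample)
```

A test run might fail when `offenders` is non-empty and compare `rate` against the 10% alert threshold over the observation window; how the window is sampled is left to the real harness.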