doctor: complete runtime check documentation sprint

Signed-off-by: master <>
master
2026-03-31 23:26:24 +03:00
parent 404d50bcb7
commit 152c1b1357
54 changed files with 2210 additions and 258 deletions

@@ -0,0 +1,56 @@
---
checkId: check.servicegraph.mq
plugin: stellaops.doctor.servicegraph
severity: warn
tags: [servicegraph, messaging, rabbitmq, connectivity]
---
# Message Queue Connectivity
## What It Checks
Reads `RabbitMQ:Host` or `Messaging:RabbitMQ:Host` plus an optional port, defaulting to `5672`, and attempts a TCP connection.
The check skips when RabbitMQ is not configured and fails on timeouts, DNS failures, or refused connections.
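
The behaviour described above can be approximated from a shell, which is useful for reproducing a result by hand. This is a minimal sketch, not Doctor's actual implementation. The function names, the `MQ_PROBE_TIMEOUT` variable, and the `Messaging__RabbitMQ__Port` key are assumptions made for illustration. It also assumes the configuration is supplied through environment variables, with `__` standing in for `:`.

```bash
#!/usr/bin/env bash
# Sketch only: approximates the documented check, not Doctor's real code.

# Host comes from RabbitMQ:Host, falling back to Messaging:RabbitMQ:Host.
resolve_mq_host() {
  printf '%s' "${RabbitMQ__Host:-${Messaging__RabbitMQ__Host:-}}"
}

# Port is optional and defaults to 5672.
# Messaging__RabbitMQ__Port is an assumed key name, not confirmed by the docs.
resolve_mq_port() {
  printf '%s' "${RabbitMQ__Port:-${Messaging__RabbitMQ__Port:-5672}}"
}

check_mq() {
  local host port
  host="$(resolve_mq_host)"
  port="$(resolve_mq_port)"

  # No host configured: report a skip instead of a failure.
  if [ -z "$host" ]; then
    echo "skip: RabbitMQ is not configured"
    return 0
  fi

  # /dev/tcp is a bash feature; `timeout` (GNU coreutils) bounds the attempt.
  # MQ_PROBE_TIMEOUT is a hypothetical knob for this sketch, in seconds.
  if timeout "${MQ_PROBE_TIMEOUT:-5}" \
       bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
    echo "pass: connected to $host:$port"
  else
    echo "fail: cannot connect to $host:$port"
    return 1
  fi
}
```

Running `RabbitMQ__Host=rabbitmq check_mq` after sourcing the snippet probes `rabbitmq:5672`. With neither host variable set, it reports a skip.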
## Why It Matters
Release tasks, notifications, and deferred work often depend on a functioning message broker. If the broker is unreachable, APIs may still report healthy while queued work backs up behind them.
## Common Causes
- `RabbitMQ__Host` is unset or points to the wrong broker
- The broker container is down
- AMQP traffic is blocked between Doctor and RabbitMQ
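
To tell these causes apart, it can help to separate name resolution from the TCP attempt itself. The sketch below is one possible triage helper, not part of Doctor. The function name `classify_mq_failure` and its return codes are invented for illustration. It assumes a glibc-style `getent` and GNU `timeout`, which exits with `124` when the time limit is hit.

```bash
#!/usr/bin/env bash
# Sketch only: maps a failed probe onto the common causes listed above.
# Usage: classify_mq_failure <host> [port] [timeout-seconds]
classify_mq_failure() {
  local host="$1" port="${2:-5672}" wait_s="${3:-5}"

  # Wrong or misspelled host: resolution fails before any TCP attempt.
  if ! getent ahosts "$host" >/dev/null 2>&1; then
    echo "dns: $host does not resolve"
    return 2
  fi

  # Open a TCP connection via bash's /dev/tcp, bounded by `timeout`.
  timeout "$wait_s" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
  case $? in
    0)
      echo "ok: $host:$port accepts connections"
      ;;
    124)
      # No answer within the limit: often dropped traffic (firewall, policy).
      echo "timeout: no response from $host:$port within ${wait_s}s"
      return 3
      ;;
    *)
      # Immediate failure: typically nothing is listening (broker down).
      echo "refused: $host:$port rejected the connection"
      return 4
      ;;
  esac
}
```

A `dns` result points at the host setting, `refused` at a stopped broker, and `timeout` at blocked AMQP traffic. The mapping is a heuristic, so treat it as a starting point.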
## How to Fix
### Docker Compose
```yaml
services:
  doctor-web:
    environment:
      RabbitMQ__Host: rabbitmq
      RabbitMQ__Port: "5672"
```
```bash
docker compose -f devops/compose/docker-compose.stella-ops.yml ps rabbitmq
docker compose -f devops/compose/docker-compose.stella-ops.yml logs --tail 100 rabbitmq
docker compose -f devops/compose/docker-compose.stella-ops.yml exec doctor-web sh -lc "nc -zv rabbitmq 5672"
```
### Bare Metal / systemd
```bash
nc -zv <rabbit-host> 5672
```
### Kubernetes / Helm
```bash
kubectl exec deploy/doctor-web -n <namespace> -- sh -lc "nc -zv <rabbit-service> 5672"
```
## Verification
```bash
stella doctor --check check.servicegraph.mq
```
## Related Checks
- `check.servicegraph.valkey` - cache and queue connectivity usually fail together when service networking is broken
- `check.servicegraph.timeouts` - aggressive timeouts can make a slow broker look unavailable