Doctor plugin checks: implement health check classes and documentation
Implement remediation-aware health checks across all Doctor plugin modules (Agent, Attestor, Auth, BinaryAnalysis, Compliance, Crypto, Environment, EvidenceLocker, Notify, Observability, Operations, Policy, Postgres, Release, Scanner, Storage, Vex) and their backing library counterparts (AI, Attestation, Authority, Core, Cryptography, Database, Docker, Integration, Notify, Observability, Security, ServiceGraph, Sources, Verification). Each check now emits structured remediation metadata (severity, category, runbook links, and fix suggestions) consumed by the Doctor dashboard remediation panel.

Also adds:
- docs/doctor/articles/ knowledge base for check explanations
- Advisory AI search seed and allowlist updates for doctor content
- Sprint plan for doctor checks documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
docs/doctor/articles/notify/webhook-connectivity.md
@@ -0,0 +1,58 @@
---
checkId: check.notify.webhook.connectivity
plugin: stellaops.doctor.notify
severity: warn
tags: [notify, webhook, connectivity, network]
---
# Webhook Connectivity

## What It Checks
Verifies that the configured generic webhook endpoint is reachable. The check:

- Sends a HEAD request to the webhook URL with a 10-second timeout, falling back to OPTIONS if HEAD is unsupported.
- Treats any response with an HTTP status below 500 as reachable. This includes 401 and 403, which indicate that the endpoint exists but requires authentication.
- Warns on HTTP 5xx responses (server-side errors).
- Fails on a connection timeout or an HTTP request exception.

The check only runs when `Notify:Channels:Webhook:Url` (or `Endpoint`) is set and is a valid absolute URL.
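
The rules in this section can be sketched as two small shell helpers. This is an illustration only, not the plugin's implementation: the function names and the `pass`/`warn`/`fail` labels are hypothetical, and accepting only `http` and `https` schemes as a "valid absolute URL" is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; names and result labels are illustrative only.

# Returns 0 when the value looks like an absolute http(s) URL.
# Assumption: only the http and https schemes count as valid here.
is_absolute_webhook_url() {
  case "$1" in
    http://?*|https://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Maps an HTTP status code to the outcome described in the list above.
# A missing, non-numeric, or 000 code stands for "no HTTP response"
# (connection timeout or request error).
classify_webhook_status() {
  local code="${1:-000}"
  case "$code" in
    *[!0-9]*) echo "fail"; return 0 ;;
  esac
  if [ "$code" -eq 0 ]; then
    echo "fail"
  elif [ "$code" -ge 500 ]; then
    echo "warn"
  else
    echo "pass"
  fi
}
```

For example, `classify_webhook_status 403` prints `pass` because the endpoint answered, while `classify_webhook_status 503` prints `warn`.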

## Why It Matters
A configured but unreachable webhook endpoint means third-party integrations silently stop receiving notifications. Events that should trigger PagerDuty alerts, SIEM ingestion, or custom dashboard updates will be lost.

## Common Causes
- Endpoint server not responding
- Network connectivity issue or firewall blocking connection
- DNS resolution failure
- TLS/SSL certificate problem on the endpoint
- Webhook endpoint service is down

## How to Fix

### Docker Compose
```bash
docker exec <notify-container> curl -v --max-time 10 https://your-endpoint/webhook
docker exec <notify-container> nslookup your-endpoint
```

### Bare Metal / systemd
```bash
curl -I https://your-endpoint/webhook
nslookup your-endpoint
nc -zv your-endpoint 443
```
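
To reproduce the check's probe order by hand, the `curl` call above can be wrapped in a small function that tries HEAD first and retries with OPTIONS. This is a rough sketch under stated assumptions: `probe_webhook_status` is a hypothetical name, and treating HTTP 405 or 501 as "HEAD is unsupported" is a guess at how the check decides to fall back.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the final HTTP status code, or returns non-zero
# when no HTTP response is received (DNS failure, refused connection, timeout).
probe_webhook_status() {
  local url="$1" code
  # -s: quiet, -o /dev/null: discard output, -I: send HEAD, -w: print the status code
  code=$(curl -s -o /dev/null -I --max-time 10 -w '%{http_code}' "$url") || return 1
  # Assumption: 405 or 501 means the server does not support HEAD
  if [ "$code" = "405" ] || [ "$code" = "501" ]; then
    code=$(curl -s -o /dev/null -X OPTIONS --max-time 10 -w '%{http_code}' "$url") || return 1
  fi
  echo "$code"
}
```

Running `probe_webhook_status https://your-endpoint/webhook || echo "unreachable"` prints either the status code the endpoint returned or `unreachable` when no response arrived within 10 seconds.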

### Kubernetes / Helm
```bash
kubectl exec -it <notify-pod> -- curl -v https://your-endpoint/webhook
```

Check that egress NetworkPolicies allow traffic to the webhook destination.

## Verification
```bash
stella doctor run --check check.notify.webhook.connectivity
```

## Related Checks
- `check.notify.webhook.configured`: verifies webhook URL is set and valid
- `check.notify.queue.health`: verifies the notification delivery queue is healthy