---
checkId: check.scanner.queue
plugin: stellaops.doctor.scanner
severity: warn
tags: [scanner, queue, jobs, processing]
---
# Scanner Queue Health

## What It Checks
Queries the Scanner service at `/api/v1/queue/stats` and evaluates job queue health across four dimensions:

- **Queue depth**: warn at 100+ pending jobs, fail at 500+.
- **Failure rate**: warn when 5% or more of processed jobs fail, fail at 15% or more.
- **Stuck jobs**: any stuck job triggers an immediate fail.
- **Backlog growth**: a growing backlog triggers a warning.
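
The thresholds above can be sketched as a small shell function. This is an illustration only, not the plugin's implementation: the function names `at_least` and `classify_queue` are hypothetical, the failure rate is assumed to be expressed in percent, and fail conditions are assumed to take precedence over warn conditions.

```bash
# at_least VALUE THRESHOLD: succeeds when VALUE >= THRESHOLD.
# awk is used so fractional values such as 7.5 compare correctly.
at_least() {
  awk -v v="$1" -v t="$2" 'BEGIN { exit !(v >= t) }'
}

# classify_queue DEPTH FAILURE_PCT STUCK_JOBS BACKLOG_GROWING
# Hypothetical helper; prints pass, warn, or fail.
# Assumption: FAILURE_PCT is a percentage (5 means 5%), BACKLOG_GROWING is true|false.
classify_queue() {
  local depth="$1" failure_pct="$2" stuck="$3" growing="$4"

  # Fail: any stuck job, 500+ pending jobs, or a failure rate of 15%+.
  if [ "$stuck" -gt 0 ] || at_least "$depth" 500 || at_least "$failure_pct" 15; then
    echo fail
  # Warn: 100+ pending jobs, a failure rate of 5%+, or a growing backlog.
  elif at_least "$depth" 100 || at_least "$failure_pct" 5 || [ "$growing" = true ]; then
    echo warn
  else
    echo pass
  fi
}
```

For example, `classify_queue 120 1 0 false` prints `warn` because the depth crosses the 100-job threshold, while `classify_queue 50 1 1 false` prints `fail` because one job is stuck.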

Evidence collected: `queue_depth`, `processing_rate_per_min`, `stuck_jobs`, `failed_jobs`, `failure_rate`, `oldest_job_age_min`, `backlog_growing`.
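
When reading the evidence, `queue_depth` and `processing_rate_per_min` together give a rough idea of how long the current backlog would take to clear. The sketch below assumes no new jobs arrive in the meantime, so it is a lower bound; `estimate_drain_minutes` is a hypothetical helper name, not part of the product.

```bash
# estimate_drain_minutes QUEUE_DEPTH PROCESSING_RATE_PER_MIN
# Prints the whole number of minutes needed to drain the queue at the
# given rate, rounded up, or "never" when the rate is zero or negative.
# Assumption: no new jobs are enqueued while the backlog drains.
estimate_drain_minutes() {
  awk -v d="$1" -v r="$2" 'BEGIN {
    if (r <= 0) { print "never"; exit }
    m = d / r
    # Round up: a partial minute still counts as a full minute.
    printf "%d\n", (m == int(m)) ? m : int(m) + 1
  }'
}
```

With a depth of 120 and a rate of 8 jobs per minute, the estimate is 15 minutes. If `backlog_growing` is true, the real figure will be higher.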

The check requires `Scanner:Url` or `Services:Scanner:Url` to be configured; otherwise it is skipped.
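
To look at the raw statistics yourself, you can call the same endpoint with `curl`. This is a minimal sketch: the `SCANNER_URL` environment variable and the `fetch_queue_stats` function are assumptions made for illustration, and your deployment may also require authentication headers that are not shown here.

```bash
# fetch_queue_stats: print the raw response from the Scanner queue stats endpoint.
# Assumption: SCANNER_URL holds the Scanner base URL (the same value you
# would configure as Scanner:Url). Returns 2 when it is not set, loosely
# mirroring how the check is skipped without a configured URL.
fetch_queue_stats() {
  local base_url="${SCANNER_URL:-}"
  if [ -z "$base_url" ]; then
    echo "SCANNER_URL is not set; nothing to query" >&2
    return 2
  fi
  # -f: treat HTTP errors as failures, -sS: quiet but still report errors,
  # --max-time: give up after 10 seconds. A trailing slash on the base URL is dropped.
  curl -fsS --max-time 10 "${base_url%/}/api/v1/queue/stats"
}
```

Run it as `SCANNER_URL=<your scanner base URL> fetch_queue_stats` after sourcing the function into your shell.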

## Why It Matters
The scanner queue is the central work pipeline for SBOM generation, vulnerability scanning, and reachability analysis. A backlogged or stuck queue delays security findings, blocks release gates that depend on scan results, and can cascade into approval timeouts. Stuck jobs indicate a worker crash or resource failure that will not self-heal.

## Common Causes
- Scanner worker process crashed or was OOM-killed
- Job dependency (registry, database) became unavailable mid-scan
- Resource exhaustion (CPU, memory, disk) on the scanner host
- Database connection lost during job processing
- Sudden spike in image pushes overwhelming worker capacity
- Processing rate slower than ingest rate during bulk import

## How to Fix

### Docker Compose
Check scanner worker status and restart if needed:

```bash
# View scanner container logs for errors
docker compose -f docker-compose.stella-ops.yml logs --tail 200 scanner

# Restart the scanner service
docker compose -f docker-compose.stella-ops.yml restart scanner

# Scale scanner workers (if using replicas)
docker compose -f docker-compose.stella-ops.yml up -d --scale scanner=4
```

Adjust concurrency via environment variables:

```yaml
environment:
  Scanner__Queue__MaxConcurrentJobs: "4"
  Scanner__Queue__StuckJobTimeoutMinutes: "30"
```

### Bare Metal / systemd
```bash
# Check scanner service status
sudo systemctl status stellaops-scanner

# View recent logs
sudo journalctl -u stellaops-scanner --since "1 hour ago"

# Restart the service
sudo systemctl restart stellaops-scanner
```

Edit `/etc/stellaops/scanner/appsettings.json`:

```json
{
  "Queue": {
    "MaxConcurrentJobs": 4,
    "StuckJobTimeoutMinutes": 30
  }
}
```

### Kubernetes / Helm
```bash
# Check scanner pod status
kubectl get pods -l app=stellaops-scanner

# View logs for crash loops
kubectl logs -l app=stellaops-scanner --tail=200

# Scale scanner deployment
kubectl scale deployment stellaops-scanner --replicas=4
```

Set in Helm `values.yaml`:

```yaml
scanner:
  replicas: 4
  queue:
    maxConcurrentJobs: 4
    stuckJobTimeoutMinutes: 30
```

## Verification
```bash
stella doctor run --check check.scanner.queue
```
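
After a restart or scale-up the backlog may need some time to drain, so a single run right away can still report a problem. The helper below is a generic retry sketch; `retry_until_ok` is a hypothetical name, and using it with the doctor command assumes that `stella doctor run` exits non-zero when the check does not pass, which you should confirm for your version.

```bash
# retry_until_ok ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS runs have been made,
# sleeping DELAY_SECONDS between attempts. Returns 0 on success, 1 otherwise.
retry_until_ok() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (assumes a non-zero exit code on a failing check):
#   retry_until_ok 5 60 stella doctor run --check check.scanner.queue
```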

## Related Checks
- `check.scanner.resources` -- scanner CPU/memory utilization affecting processing rate
- `check.scanner.sbom` -- SBOM generation failures may originate from queue issues
- `check.scanner.vuln` -- vulnerability scan health depends on queue throughput
- `check.operations.job-queue` -- platform-wide job queue health