---
checkId: check.agent.task.backlog
plugin: stellaops.doctor.agent
severity: warn
tags: [agent, task, queue, capacity]
---

# Task Queue Backlog

## What It Checks

Monitors the pending task queue depth across the agent fleet to detect capacity issues. The check is designed to evaluate:

1. Total queued tasks across the entire fleet
2. Age of the oldest queued task (how long tasks wait before dispatch)
3. Queue growth rate trend (growing, stable, or draining)

**Current status:** implementation pending -- the check always returns Pass with a placeholder message. The `CanRun` method always returns true.

## Why It Matters

A growing task backlog means agents cannot keep up with incoming work. Tasks age in the queue, SLA timers expire, and users experience delayed deployments and scan results. If the backlog grows unchecked, it can cascade: delayed scans block policy gates, which block promotions, which block release trains. Detecting backlog growth early allows operators to scale the fleet or prioritize the queue.

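Because the check is still a placeholder, the growth rate trend listed under What It Checks has no reference implementation. The following is a minimal sketch of one way such a trend could be classified from a series of queue-depth samples. The function name `classify_trend`, the `tolerance` parameter, and the first-versus-last comparison are all illustrative assumptions, not the check's actual logic.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the stella CLI or the doctor plugin).
# Usage: classify_trend <tolerance> <depth_oldest> ... <depth_newest>
# Compares the newest sample with the oldest and prints one of:
#   growing | stable | draining | unknown
classify_trend() {
  local tolerance=$1
  shift

  # At least two samples are needed to say anything about a trend.
  if (( $# < 2 )); then
    echo "unknown"
    return 0
  fi

  local first=$1
  local last=${!#}              # last positional parameter
  local delta=$(( last - first ))

  if (( delta > tolerance )); then
    echo "growing"
  elif (( delta < -tolerance )); then
    echo "draining"
  else
    echo "stable"
  fi
}
```

For example, `classify_trend 5 10 20 40` compares 40 against 10 with a tolerance of 5 and reports `growing`. In practice the samples might be collected by calling `stella agent tasks --status queued --count` at a fixed interval, assuming that command prints a single integer; that output format is not confirmed here.
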
## Common Causes

- Insufficient agent count for current workload
- One or more agents offline, reducing effective fleet capacity
- Task burst from bulk operations (mass rescans, environment-wide deployments)
- Slow tasks monopolizing agent slots (large image scans, complex builds)
- Task dispatch paused due to configuration or freeze window

## How to Fix

### Docker Compose

```bash
# Check current queue depth
docker compose -f devops/compose/docker-compose.stella-ops.yml exec agent \
  stella agent tasks --status queued --count

# Scale agents to reduce backlog
docker compose -f devops/compose/docker-compose.stella-ops.yml up -d --scale agent=3

# Increase concurrent task limit per agent
# Set environment variable in compose override:
# AGENT__MAXCONCURRENTTASKS=8
```

### Bare Metal / systemd

```bash
# Check queue depth and oldest task
stella agent tasks --status queued

# Increase concurrent task limit
stella agent config --agent-id --set max_concurrent_tasks=8

# Add more agents to the fleet
stella agent bootstrap --name agent-03 --env production --platform linux
```

### Kubernetes / Helm

```bash
# Check queue depth
kubectl exec -it deploy/stellaops-agent -n stellaops -- \
  stella agent tasks --status queued --count

# Scale agent deployment
kubectl scale deployment stellaops-agent --replicas=5 -n stellaops

# Or use HPA for auto-scaling
# agent:
#   autoscaling:
#     enabled: true
#     minReplicas: 2
#     maxReplicas: 10
#     targetCPUUtilizationPercentage: 70
helm upgrade stellaops stellaops/stellaops -f values.yaml
```

## Verification

```
stella doctor run --check check.agent.task.backlog
```

## Related Checks

- `check.agent.capacity` -- backlog grows when capacity is insufficient
- `check.agent.task.failure.rate` -- failed tasks may be re-queued, inflating the backlog
- `check.agent.resource.utilization` -- saturated agents process tasks slowly
- `check.agent.heartbeat.freshness` -- offline agents reduce dispatch targets