# Runbook: Scanner - Scan Timeout on Complex Images

> **Sprint:** SPRINT_20260117_029_DOCS_runbook_coverage
> **Task:** RUN-002 - Scanner Runbooks

## Metadata

| Field | Value |
|-------|-------|
| **Component** | Scanner |
| **Severity** | Medium |
| **On-call scope** | Platform team |
| **Last updated** | 2026-01-17 |
| **Doctor check** | `check.scanner.timeout-rate` |

---

## Symptoms

- [ ] Scans failing with "timeout exceeded" error
- [ ] Alert `ScannerTimeoutExceeded` firing
- [ ] Metric `scanner_scan_timeout_total` increasing
- [ ] Specific images consistently timing out
- [ ] Error log: "scan operation exceeded timeout of X seconds"

---

## Impact

| Impact Type | Description |
|-------------|-------------|
| **User-facing** | Specific images cannot be scanned; pipeline blocked |
| **Data integrity** | No data loss; scans can be retried with adjusted settings |
| **SLA impact** | Release pipeline delayed for affected images |

---

## Diagnosis

### Quick checks

1. **Check Doctor diagnostics:**
   ```bash
   stella doctor --check check.scanner.timeout-rate
   ```

2. **Identify failing images:**
   ```bash
   stella scanner jobs list --status timeout --last 1h
   ```
   Look for: Pattern in image types or sizes

3. **Check current timeout settings:**
   ```bash
   stella scanner config get timeouts
   ```

### Deep diagnosis

1. **Analyze image complexity:**
   ```bash
   stella image inspect --format json | jq '{size, layers: .layers | length, files: .manifest.fileCount}'
   ```
   Problem if: > 50 layers, > 100k files, or > 5GB size

2. **Check scanner worker load:**
   ```bash
   stella scanner workers stats
   ```
   Problem if: All workers at capacity during timeouts

3. **Profile a scan:**
   ```bash
   stella scan image --image --profile --verbose
   ```
   Look for: Which phase is slowest (layer extraction, SBOM generation, vuln matching)

4. **Check for filesystem-heavy images:**
   ```bash
   stella image layers --sort-by file-count
   ```
   Problem if: Single layer with > 50k files (e.g., node_modules)

---

## Resolution

### Immediate mitigation

1. **Increase timeout for specific image:**
   ```bash
   stella scan image --image --timeout 30m
   ```

2. **Increase global scan timeout:**
   ```bash
   stella scanner config set timeouts.scan 20m
   stella scanner workers restart
   ```

3. **Enable fast mode for initial scan:**
   ```bash
   stella scan image --image --fast-mode
   ```

### Root cause fix

**If image is too complex:**

1. Enable incremental scanning:
   ```bash
   stella scanner config set scan.incremental_mode true
   ```

2. Configure layer caching:
   ```bash
   stella scanner config set cache.layer_dedup true
   stella scanner config set cache.sbom_cache true
   ```

**If filesystem is too large:**

1. Enable streaming SBOM generation:
   ```bash
   stella scanner config set sbom.streaming_threshold 500Gi
   ```

2. Configure file sampling for massive images:
   ```bash
   stella scanner config set sbom.file_sample_max 100000
   ```

**If vulnerability matching is slow:**

1. Enable parallel matching:
   ```bash
   stella scanner config set vuln.parallel_matching true
   stella scanner config set vuln.match_workers 4
   ```

2. Optimize vulnerability database indexes:
   ```bash
   stella db optimize --component scanner
   ```

### Verification

```bash
# Retry the previously failing scan
stella scan image --image --timeout 30m

# Monitor scan progress
stella scanner jobs watch

# Verify no timeouts in recent scans
stella scanner jobs list --status timeout --last 1h
```

---

## Prevention

- [ ] **Capacity:** Configure appropriate timeouts based on expected image complexity (15m default, 30m for large)
- [ ] **Monitoring:** Alert on timeout rate > 5%
- [ ] **Caching:** Enable layer and SBOM caching for base images
- [ ] **Documentation:** Document image size/complexity limits in user guide

---

## Related Resources

- **Architecture:** `docs/modules/scanner/architecture.md`
- **Related runbooks:** `scanner-oom.md`, `scanner-worker-stuck.md`
- **Dashboard:** Grafana > Stella Ops > Scanner Performance
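
---

## Appendix: Complexity threshold check (sketch)

The "Problem if" thresholds from the image complexity step under Deep diagnosis (> 50 layers, > 100k files, > 5GB size) can be applied mechanically once the three numbers are known. The helper below is a minimal sketch, not part of the `stella` CLI: the function name `check_image_complexity` and its argument order are hypothetical, and it assumes "5GB" means 5 × 10^9 bytes and that the size is already expressed in bytes.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not a stella command): compares an image's layer count,
# file count, and size against the thresholds listed under "Deep diagnosis".
# Usage: check_image_complexity <layer_count> <file_count> <size_bytes>
check_image_complexity() {
  local layers=$1 files=$2 size_bytes=$3
  local reasons=()

  # Thresholds are strict ("greater than"), matching the runbook wording.
  (( layers > 50 )) && reasons+=("layers>50")
  (( files > 100000 )) && reasons+=("files>100k")
  # Assumption: 5GB is read as decimal gigabytes (5 * 10^9 bytes).
  (( size_bytes > 5000000000 )) && reasons+=("size>5GB")

  if (( ${#reasons[@]} == 0 )); then
    echo "ok"
    return 0
  fi

  # Join the exceeded thresholds with commas for a one-line summary.
  local IFS=,
  echo "complex: ${reasons[*]}"
  return 1
}
```

For example, `check_image_complexity 60 200000 6000000000` reports all three thresholds as exceeded and returns a non-zero status, which makes it usable in a pre-scan gate that picks a longer `--timeout` for flagged images. The unit of the `size` field in the `stella image inspect` output is not stated in this runbook, so convert it to bytes before passing it in.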