Runbook: Scanner - Registry Authentication Failures
Sprint: SPRINT_20260117_029_DOCS_runbook_coverage Task: RUN-002 - Scanner Runbooks
Metadata
| Field | Value |
|---|---|
| Component | Scanner |
| Severity | High |
| On-call scope | Platform team, Security team |
| Last updated | 2026-01-17 |
| Doctor check | check.scanner.registry-auth |
Symptoms
- Scans failing with "401 Unauthorized" or "403 Forbidden"
- Alert ScannerRegistryAuthFailed firing
- Error: "failed to authenticate with registry"
- Error: "failed to pull image manifest"
- Scans work for public images but fail for private images
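The symptoms above point to different likely causes: in general HTTP terms, a 401 means the credentials were not accepted, while a 403 means the identity was recognized but lacks permission. A minimal Python sketch of a first-pass triage follows, assuming the scan error text contains the status code or one of the messages listed above. The function name and the cause descriptions are illustrative and not part of the Stella tooling.

```python
def triage_registry_error(message: str) -> str:
    """Map a scan error message to a likely cause (illustrative heuristic only)."""
    text = message.lower()
    if "401" in text or "unauthorized" in text:
        # 401: the registry did not accept the presented credentials.
        return "credentials missing, invalid, or expired"
    if "403" in text or "forbidden" in text:
        # 403: the identity was recognized but is not allowed to pull.
        return "authenticated but not permitted (check role or policy)"
    if "failed to authenticate with registry" in text:
        return "authentication failed (run a verbose registry test for the status code)"
    if "failed to pull image manifest" in text:
        return "manifest pull failed (possibly auth; confirm with a registry test)"
    return "not recognized as a registry auth failure"


if __name__ == "__main__":
    print(triage_registry_error("scan failed: 403 Forbidden"))
```

Treat the result as a hint for which diagnosis step to start with, not as a verdict.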
Impact
| Impact Type | Description |
|---|---|
| User-facing | Cannot scan private images; release pipeline blocked |
| Data integrity | No data loss; authentication issue only |
| SLA impact | All scans for affected registry blocked |
Diagnosis
Quick checks
- Check Doctor diagnostics:
  stella doctor --check check.scanner.registry-auth
- List configured registries:
  stella registry list --show-status
  Look for: Registries with "auth_failed" status
- Test registry authentication:
  stella registry test <registry-url>
Deep diagnosis
- Check credential expiration:
  stella registry credentials show <registry-name>
  Look for: Expiration date, token type
- Test with verbose output:
  stella registry test <registry-url> --verbose
  Look for: Specific auth error message, HTTP status code
- Check registry logs:
  stella scanner logs --filter "registry auth" --last 30m
- Verify IAM/OIDC configuration (for cloud registries):
  stella registry iam-status <registry-name>
  Problem if: IAM role not assumable, OIDC token expired
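When reading the expiration date from the credential check, it helps to classify it against the current time. The sketch below assumes the expiration is available as an ISO 8601 timestamp with an explicit UTC offset; the actual output format of `stella registry credentials show` is not specified here, and the function name and the one-hour warning window are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def expiry_status(
    expires_at: str,
    now: datetime,
    warn_within: timedelta = timedelta(hours=1),
) -> str:
    """Classify a credential as 'expired', 'expiring soon', or 'valid'.

    expires_at: ISO 8601 timestamp, e.g. '2026-01-17T12:30:00+00:00' (assumed format).
    now: timezone-aware current time.
    """
    expiry = datetime.fromisoformat(expires_at)
    if expiry.tzinfo is None:
        # Assumption: timestamps without an offset are in UTC.
        expiry = expiry.replace(tzinfo=timezone.utc)
    remaining = expiry - now
    if remaining <= timedelta(0):
        return "expired"
    if remaining <= warn_within:
        return "expiring soon"
    return "valid"


if __name__ == "__main__":
    print(expiry_status("2026-01-17T12:30:00+00:00", datetime.now(timezone.utc)))
```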
Resolution
Immediate mitigation
- Refresh credentials (for token-based auth):
  stella registry refresh-credentials <registry-name>
- Update static credentials:
  stella registry update-credentials <registry-name> \
    --username <user> \
    --password <token>
- For Docker Hub rate limiting:
  stella registry configure docker-hub \
    --username <user> \
    --access-token <token>
Root cause fix
If credentials expired:
- Generate new access token in registry (ECR, GCR, ACR, etc.)
- Update credentials:
  stella registry update-credentials <registry-name> --from-env
- Configure automatic token refresh:
  stella registry config set <registry-name>.auto_refresh true
  stella registry config set <registry-name>.refresh_interval 11h
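The refresh interval only helps if it is shorter than the token lifetime. For example, Amazon ECR authorization tokens are valid for 12 hours, so an 11h interval leaves a one-hour margin. The following sketch checks an interval against a token lifetime; it assumes single-unit values such as "11h" or "90m", and the helper names and the 30-minute default margin are illustrative, not Stella settings.

```python
import re
from datetime import timedelta

# Supported unit suffixes for this sketch (assumption: single-unit values only).
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours"}


def parse_interval(value: str) -> timedelta:
    """Parse a value such as '11h' or '90m' into a timedelta."""
    match = re.fullmatch(r"(\d+)([smh])", value.strip())
    if match is None:
        raise ValueError(f"unsupported interval: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def refresh_is_safe(
    interval: str,
    token_lifetime: timedelta,
    margin: timedelta = timedelta(minutes=30),
) -> bool:
    """Return True if a refresh happens at least `margin` before the token expires."""
    return parse_interval(interval) + margin <= token_lifetime


if __name__ == "__main__":
    # 12-hour token lifetime, as with Amazon ECR authorization tokens.
    print(refresh_is_safe("11h", timedelta(hours=12)))
```

Check the token lifetime for each registry type before reusing the same interval, since lifetimes differ between providers.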
If IAM role/policy changed (AWS ECR):
- Verify IAM role permissions:
  stella registry iam verify <registry-name>
- Update IAM role ARN if changed:
  stella registry configure ecr \
    --region <region> \
    --role-arn <arn>
If OIDC federation changed (GCP Artifact Registry):
- Verify service account:
  stella registry oidc verify <registry-name>
- Update workload identity configuration:
  stella registry configure gcr \
    --project <project> \
    --workload-identity-provider <provider>
If certificate changed (self-hosted registries):
- Update CA certificate:
  stella registry configure <registry-name> \
    --ca-cert /path/to/ca.crt
- Or skip verification (not recommended for production):
  stella registry configure <registry-name> \
    --insecure-skip-verify
Verification
# Test authentication
stella registry test <registry-url>
# Test scanning a private image
stella scan image --image <registry-url>/<image>:<tag> --dry-run
# Verify no auth failures in recent logs
stella scanner logs --filter "auth" --level error --last 30m
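The verification commands above can be run in sequence, stopping at the first failure. This is a rough Python sketch under the assumption that the `stella` CLI exits with a non-zero status when a command fails, which this runbook does not confirm. The function names are hypothetical, and the runner is injectable so the sequence can be exercised without the CLI installed.

```python
import subprocess
from typing import Callable, List, Sequence


def verification_commands(registry_url: str, image: str, tag: str) -> List[List[str]]:
    """Build the three verification commands from this runbook as argv lists."""
    return [
        ["stella", "registry", "test", registry_url],
        ["stella", "scan", "image", "--image", f"{registry_url}/{image}:{tag}", "--dry-run"],
        ["stella", "scanner", "logs", "--filter", "auth", "--level", "error", "--last", "30m"],
    ]


def run_command(argv: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(argv), check=False).returncode


def verify(
    registry_url: str,
    image: str,
    tag: str,
    runner: Callable[[Sequence[str]], int] = run_command,
) -> bool:
    """Run the verification steps in order; stop at the first non-zero exit code."""
    for argv in verification_commands(registry_url, image, tag):
        if runner(argv) != 0:
            print("FAILED:", " ".join(argv))
            return False
    return True
```

The log query in the last step may exit successfully even when it prints error entries, so its output still needs to be read by a person.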
Prevention
- Credentials: Use service accounts/workload identity instead of static tokens
- Rotation: Configure automatic token refresh before expiration
- Monitoring: Alert on authentication failure rate > 0
- Documentation: Document registry credential management procedures
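The monitoring rule above ("failure rate > 0") can be illustrated with a small calculation. This sketch assumes authentication outcomes for a time window are available as a sequence of booleans (True for success); how those outcomes are collected is outside this runbook, and the function names are illustrative.

```python
from typing import Iterable


def auth_failure_rate(outcomes: Iterable[bool]) -> float:
    """Fraction of failed authentication attempts; 0.0 when there are no attempts."""
    results = list(outcomes)
    if not results:
        return 0.0
    failures = sum(1 for succeeded in results if not succeeded)
    return failures / len(results)


def should_alert(outcomes: Iterable[bool], threshold: float = 0.0) -> bool:
    """Alert when the failure rate exceeds the threshold (default: any failure)."""
    return auth_failure_rate(outcomes) > threshold


if __name__ == "__main__":
    window = [True, True, False, True]  # one failure out of four attempts
    print(auth_failure_rate(window), should_alert(window))
```

With the default threshold of 0, a single failed attempt in the window is enough to alert, which matches the rule as written.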
Related Resources
- Architecture: docs/modules/scanner/registry-auth.md
- Related runbooks: scanner-worker-stuck.md, scanner-timeout.md
- Registry setup: docs/operations/registry-configuration.md