# Unknowns Queue Management Runbook

> **Version**: 1.0.0
> **Sprint**: 3500.0004.0004
> **Last Updated**: 2025-12-20

This runbook covers operational procedures for managing the Unknowns queue, including triage, escalation, resolution, and queue health maintenance.

---

## Table of Contents

1. [Overview](#1-overview)
2. [Queue Operations](#2-queue-operations)
3. [Triage Procedures](#3-triage-procedures)
4. [Escalation Workflows](#4-escalation-workflows)
5. [Resolution Procedures](#5-resolution-procedures)
6. [Troubleshooting](#6-troubleshooting)
7. [Monitoring & Alerting](#7-monitoring--alerting)

---

## 1. Overview

### What are Unknowns?

Unknowns are items that could not be fully classified during scanning due to:

- Missing VEX statements
- Ambiguous indirect calls in call graphs
- Incomplete SBOM data
- Missing advisory information
- Conflicting evidence from multiple sources

### Unknown Ranking

Unknowns are ranked using a weighted scoring model with three factors (blast radius, evidence scarcity, exploit pressure) and a containment deduction:

```
score = 0.60 × blast + 0.30 × scarcity + 0.30 × pressure + containment_deduction
```

| Factor | Weight | Description |
|--------|--------|-------------|
| Blast Radius | 0.60 | Impact scope (dependents, network exposure) |
| Evidence Scarcity | 0.30 | How much data is missing |
| Exploit Pressure | 0.30 | EPSS score, KEV status |
| Containment | -0.20 | Mitigation factors (seccomp, read-only FS) |

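As a rough illustration of the formula above, the following shell sketch applies the documented weights to raw factor values. The function name `unknown_score` is hypothetical (it is not a `stella` command), and the sketch assumes each raw factor is normalized to the range 0 to 1 and that the containment deduction is passed in as a negative number; the scanner's own implementation may normalize or clamp differently.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the stella CLI): applies the documented
# weights to raw factor values and adds the containment deduction.
# Assumption: blast, scarcity, and pressure are normalized to 0..1, and the
# containment deduction is already negative (for example -0.20).
unknown_score() {
  local blast="$1" scarcity="$2" pressure="$3" containment="$4"
  LC_ALL=C awk -v b="$blast" -v s="$scarcity" -v p="$pressure" -v c="$containment" \
    'BEGIN { printf "%.2f\n", 0.60 * b + 0.30 * s + 0.30 * p + c }'
}

# Illustrative inputs only: blast 0.50, scarcity 0.70, pressure 0.40,
# containment deduction -0.20.
unknown_score 0.50 0.70 0.40 -0.20   # prints 0.43
```

With these inputs the weighted terms are 0.30, 0.21, and 0.12, which gives 0.63 before the deduction and 0.43 after it.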
### Band Assignment

| Band | Score Range | Priority | SLA |
|------|-------------|----------|-----|
| HOT | ≥ 0.70 | Critical | 24 hours |
| WARM | 0.40 - 0.69 | Normal | 7 days |
| COLD | < 0.40 | Low | 30 days |

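The band thresholds above can be sketched as a small shell function. The name `unknown_band` is hypothetical, and the sketch assumes that scores between 0.69 and 0.70 (for example 0.695) fall into WARM, since the table does not say how that gap is handled.

```bash
#!/usr/bin/env bash
# Hypothetical helper: maps a score to the band names in the table above.
# Assumption: anything below 0.70 and at or above 0.40 counts as WARM.
unknown_band() {
  LC_ALL=C awk -v s="$1" 'BEGIN {
    if (s + 0 >= 0.70)      print "HOT"
    else if (s + 0 >= 0.40) print "WARM"
    else                    print "COLD"
  }'
}

unknown_band 0.62   # prints WARM, matching the example score in section 2.3
```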
---

## 2. Queue Operations

### 2.1 View Queue Status

```bash
# Get queue summary
stella unknowns summary

# Output:
# Total: 142 unknowns
#   HOT: 12 (8%) - Requires immediate attention
#   WARM: 85 (60%) - Normal priority
#   COLD: 45 (32%) - Low priority
#
# KEV items: 3
# Average score: 0.52

# Get queue summary via API
curl "https://scanner.example.com/api/v1/unknowns/summary" \
  -H "Authorization: Bearer $TOKEN"
```

### 2.2 List Unknowns

```bash
# List all HOT unknowns
stella unknowns list --band HOT

# List by score (highest first)
stella unknowns list --sort score --order desc --limit 20

# Filter by reason
stella unknowns list --reason missing_vex

# Filter by artifact
stella unknowns list --artifact sha256:abc123...

# Filter by KEV status
stella unknowns list --kev true
```

### 2.3 View Unknown Details

```bash
# Get detailed view
stella unknowns show unk-12345678-abcd-1234-5678-abcdef123456

# Output:
# ID: unk-12345678-...
# Artifact: pkg:oci/myapp@sha256:abc123
# Reasons: [missing_vex, ambiguous_indirect_call]
#
# Blast Radius:
#   Dependents: 15 services
#   Network: internet-facing
#   Privilege: user
#
# Evidence Scarcity: 0.7 (high)
#
# Exploit Pressure:
#   EPSS: 0.45
#   KEV: false
#
# Containment:
#   Seccomp: enforced (-0.10)
#   Filesystem: read-only (-0.10)
#
# Score: 0.62 (WARM band)
# Score Breakdown:
#   Blast component: +0.35
#   Scarcity component: +0.21
#   Pressure component: +0.26
#   Containment deduction: -0.20

# Show proof tree
stella unknowns proof unk-12345678-...
```

### 2.4 Export Queue Data

```bash
# Export for analysis
stella unknowns export --format json --output unknowns.json

# Export HOT items for daily review
stella unknowns export --band HOT --format csv --output hot-unknowns.csv

# Export with full details
stella unknowns export --verbose --include-proofs --output full-export.json
```

---

## 3. Triage Procedures

### 3.1 Daily Triage Workflow

**Schedule**: Daily at 9:00 AM

**Duration**: 30 minutes

**Participants**: Security analyst, on-call engineer

**Process**:

```bash
# 1. Get today's queue snapshot
stella unknowns snapshot --output daily-$(date +%Y%m%d).json

# 2. Review all HOT items
stella unknowns list --band HOT --since 24h

# 3. For each HOT unknown, determine action:
#    - Escalate: Trigger immediate rescan
#    - Investigate: Needs manual analysis
#    - Defer: Move to WARM (with justification)
#    - Resolve: Evidence found, can close

# 4. Process each item
stella unknowns triage unk-12345678-... --action escalate
stella unknowns triage unk-87654321-... --action investigate --notes "Need VEX from vendor"
stella unknowns triage unk-11111111-... --action defer --reason "False positive suspected"
```

### 3.2 Triage Decision Matrix

| Reason Code | KEV | EPSS > 0.5 | Action |
|-------------|-----|------------|--------|
| `missing_vex` | Yes | Any | Escalate + Vendor outreach |
| `missing_vex` | No | Yes | Escalate |
| `missing_vex` | No | No | Request VEX |
| `ambiguous_indirect_call` | Any | Any | Manual code review |
| `incomplete_sbom` | Any | Any | Rescan with updated extractor |
| `conflicting_evidence` | Any | Any | Manual analysis |

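The matrix above can be read as a simple lookup. The following shell sketch encodes it for illustration only; `triage_action` is a hypothetical helper rather than a `stella` subcommand, and it assumes the KEV flag is passed as `true` or `false` and the EPSS score as a decimal between 0 and 1.

```bash
#!/usr/bin/env bash
# Hypothetical helper: returns the action from the decision matrix above.
# Arguments: reason code, KEV flag (true/false), EPSS score (0..1).
triage_action() {
  local reason="$1" kev="$2" epss="$3"
  case "$reason" in
    missing_vex)
      if [ "$kev" = "true" ]; then
        echo "Escalate + Vendor outreach"
      elif LC_ALL=C awk -v e="$epss" 'BEGIN { exit !(e + 0 > 0.5) }'; then
        echo "Escalate"
      else
        echo "Request VEX"
      fi ;;
    ambiguous_indirect_call) echo "Manual code review" ;;
    incomplete_sbom)         echo "Rescan with updated extractor" ;;
    conflicting_evidence)    echo "Manual analysis" ;;
    *) echo "No matrix entry for reason: $reason" >&2; return 1 ;;
  esac
}

triage_action missing_vex false 0.45   # prints Request VEX
```

Reason codes that the matrix does not list return a non-zero status, so a wrapper script can route them to manual review instead of guessing.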
### 3.3 Triage Templates

```bash
# Quick escalate (HOT + KEV)
stella unknowns triage unk-... --action escalate \
  --priority P1 \
  --notes "KEV item, requires immediate attention"

# Request vendor VEX
stella unknowns triage unk-... --action investigate \
  --notes "Requested VEX from vendor via security@vendor.com" \
  --due-date 7d

# Mark for code review
stella unknowns triage unk-... --action investigate \
  --notes "Requires manual code review to resolve indirect call" \
  --assign @code-review-team

# Defer with justification
stella unknowns triage unk-... --action defer \
  --reason "Component not deployed to production" \
  --evidence "deployment-manifest.yaml shows staging-only"
```

---

## 4. Escalation Workflows

### 4.1 Automatic Escalation

Unknowns are automatically escalated when:

- Score increases above HOT threshold (0.70)
- KEV status added to related CVE
- EPSS score increases significantly (> 0.2 delta)
- Blast radius increases (new dependents detected)

**Configure auto-escalation**:

```yaml
# policy.unknowns.escalation.yaml
autoEscalation:
  enabled: true
  triggers:
    - condition: score >= 0.70
      action: escalate
      notify: [security-team]
    - condition: kev == true
      action: escalate
      priority: P1
      notify: [security-team, management]
    - condition: epss_delta > 0.2
      action: escalate
      notify: [security-team]
```

### 4.2 Manual Escalation

```bash
# Escalate via CLI
stella unknowns escalate unk-12345678-...

# Escalate with reason
stella unknowns escalate unk-12345678-... \
  --reason "Customer reported potential exploit"

# Escalate to trigger rescan
stella unknowns escalate unk-12345678-... --rescan

# Output:
# Escalated: unk-12345678-...
# Rescan job: rescan-job-001
# Status: queued
# ETA: 5 minutes
```

### 4.3 Bulk Escalation

```bash
# Escalate all KEV items
stella unknowns escalate --filter "kev=true" --reason "KEV bulk escalation"

# Escalate high-score items
stella unknowns escalate --filter "score>=0.8" --rescan

# Escalate by artifact
stella unknowns escalate --artifact sha256:abc123... --reason "Production incident"
```

### 4.4 Escalation SLA Tracking

```bash
# Check SLA status
stella unknowns sla-status

# Output:
# HOT unknowns SLA (24h):
#   In SLA: 10 (83%)
#   Breached: 2 (17%)
#
# Breached items:
#   unk-111... (26h old) - missing_vex
#   unk-222... (30h old) - conflicting_evidence

# Get SLA breach notifications
stella unknowns list --sla-breached
```

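To make the breach logic concrete, here is a minimal shell sketch that maps each band to its SLA window from the Band Assignment table and flags an item as breached once its age exceeds that window. The helper names `sla_hours` and `sla_breached` are hypothetical, and the sketch assumes age is counted in whole hours and that an item exactly at the limit is still within SLA.

```bash
#!/usr/bin/env bash
# Hypothetical helpers: SLA window per band (in hours) and a breach check.
sla_hours() {
  case "$1" in
    HOT)  echo 24 ;;
    WARM) echo 168 ;;   # 7 days
    COLD) echo 720 ;;   # 30 days
    *)    echo "unknown band: $1" >&2; return 1 ;;
  esac
}

# Usage: sla_breached BAND AGE_HOURS; prints "breached" or "in-sla".
# Assumption: breach means the age strictly exceeds the SLA window.
sla_breached() {
  local limit
  limit="$(sla_hours "$1")" || return 1
  if [ "$2" -gt "$limit" ]; then echo "breached"; else echo "in-sla"; fi
}

sla_breached HOT 26   # prints breached, like the 26h-old item in the sample output
```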
---

## 5. Resolution Procedures

### 5.1 Resolution Types

| Resolution | Description | Evidence Required |
|------------|-------------|-------------------|
| `not_affected` | Vulnerability doesn't apply | VEX statement or manual analysis |
| `fixed` | Vulnerability patched | Version upgrade confirmation |
| `mitigated` | Controls in place | Mitigation documentation |
| `false_positive` | Incorrect classification | Analysis report |
| `wont_fix` | Accepted risk | Risk acceptance form |

### 5.2 Resolve Unknown

```bash
# Resolve as not affected
stella unknowns resolve unk-12345678-... \
  --resolution not_affected \
  --justification "vulnerable_code_not_present" \
  --notes "Manual code review confirmed function not used"

# Resolve as fixed
stella unknowns resolve unk-12345678-... \
  --resolution fixed \
  --justification "version_upgraded" \
  --evidence "Upgraded lodash to 4.17.21, CVE patched"

# Resolve as mitigated
stella unknowns resolve unk-12345678-... \
  --resolution mitigated \
  --justification "inline_mitigations_exist" \
  --evidence "WAF rule WAF-001 blocks exploit pattern"

# Resolve as won't fix (risk accepted)
stella unknowns resolve unk-12345678-... \
  --resolution wont_fix \
  --justification "risk_accepted" \
  --evidence "Risk acceptance ticket RISK-123" \
  --expires 90d  # Re-evaluate in 90 days
```

### 5.3 Bulk Resolution

```bash
# Resolve all items for a fixed package version
stella unknowns resolve-batch \
  --filter "purl=pkg:npm/lodash@4.17.20" \
  --resolution fixed \
  --justification "Upgraded to 4.17.21 fleet-wide" \
  --evidence "Fleet upgrade ticket FLEET-456"

# Resolve false positives from analysis
stella unknowns resolve-batch \
  --file false-positives.json \
  --resolution false_positive
```

### 5.4 Resolution Audit Trail

```bash
# View resolution history
stella unknowns history unk-12345678-...

# Output:
# 2025-12-15 10:00:00 - Created (score: 0.62)
# 2025-12-16 09:30:00 - Triaged by analyst@example.com
# 2025-12-17 14:00:00 - Escalated (KEV added)
# 2025-12-18 11:00:00 - Resolved by security@example.com
#   Resolution: not_affected
#   Justification: vulnerable_code_not_present
#   Notes: Manual code review confirmed function not used

# Export audit trail
stella unknowns audit-export --from 2025-01-01 --to 2025-12-31 --output audit.json
```

---

## 6. Troubleshooting

### 6.1 Score Seems Wrong

**Symptom**: Unknown scored too high or too low.

**Diagnosis**:

```bash
# View score breakdown
stella unknowns show unk-... --score-details

# View proof tree
stella unknowns proof unk-... --verbose
```

**Common causes**:

1. **Stale EPSS data**: EPSS feed not updated
2. **Incorrect blast radius**: Dependency data outdated
3. **Missing containment data**: Seccomp/filesystem status unknown

**Resolution**:

```bash
# Trigger score recalculation
stella unknowns recalculate unk-...

# Force refresh of all input signals
stella unknowns refresh unk-... --force
```

### 6.2 Duplicate Unknowns

**Symptom**: Same issue appears multiple times.

**Diagnosis**:

```bash
# Find potential duplicates
stella unknowns duplicates --scan

# Output shows items with same CVE+PURL but different artifacts
```

**Resolution**:

```bash
# Merge duplicates
stella unknowns merge \
  --primary unk-111... \
  --secondary unk-222... \
  --reason "Same CVE across artifact versions"
```

### 6.3 Escalation Not Working

**Symptom**: Escalation doesn't trigger rescan.

**Diagnosis**:

```bash
# Check escalation status
stella unknowns escalation-status unk-...

# Check Scheduler connectivity
stella health check --service scheduler

# Check job queue
stella scheduler queue status rescan
```

**Resolution**:

```bash
# Retry escalation
stella unknowns escalate unk-... --force

# Manual rescan trigger
stella scan trigger --artifact sha256:abc123... --priority high
```

### 6.4 Resolution Rejected

**Symptom**: Resolution attempt fails validation.

**Diagnosis**:

```bash
# Check resolution requirements
stella unknowns resolution-requirements unk-...

# Output:
# Resolution requirements for unk-12345678-...
# - Justification: required
# - Evidence: required (reason: KEV item)
# - Approver: required (band: HOT)
```

**Resolution**:

```bash
# Provide required evidence
stella unknowns resolve unk-... \
  --resolution not_affected \
  --justification "vulnerable_code_not_present" \
  --evidence "Code review: CRV-123" \
  --approver security-lead@example.com
```

---

## 7. Monitoring & Alerting

> **Updated**: Sprint SPRINT_20260118_018_Unknowns_queue_enhancement (UQ-007)

### 7.1 Key Metrics

| Metric | Description | Alert Threshold |
|--------|-------------|-----------------|
| `unknowns_queue_depth_hot` | HOT band queue depth | > 5 critical, > 0 for 1h warning |
| `unknowns_queue_depth_warm` | WARM band queue depth | > 25 warning |
| `unknowns_queue_depth_cold` | COLD band queue depth | > 100 warning |
| `unknowns_sla_compliance` | SLA compliance rate (0-1) | < 0.80 critical, < 0.95 warning |
| `unknowns_sla_breach_total` | Total SLA breaches (counter) | increase > 0 |
| `unknowns_escalated_total` | Escalations (counter) | rate > 10/hour |
| `unknowns_demoted_total` | Demotions (counter) | - |
| `unknowns_expired_total` | Expirations (counter) | - |
| `unknowns_processing_time_seconds` | Processing time histogram | p95 > 30s |
| `unknowns_resolution_time_hours` | Resolution time by band | p95 > SLA |
| `unknowns_state_transitions_total` | State transitions (by from/to) | - |
| `greyqueue_stuck_total` | Stuck processing entries | > 0 |
| `greyqueue_timeout_total` | Processing timeouts | > 5/hour |
| `greyqueue_processing_count` | Currently processing | > 10 for 30m |

### 7.2 Grafana Dashboard

Import dashboard from: `devops/observability/grafana/dashboards/unknowns-queue-dashboard.json`

**Dashboard Panels:**

| Panel | Description |
|-------|-------------|
| Total Queue Depth | Stat showing total across all bands |
| HOT/WARM/COLD Unknowns | Individual band stats with thresholds |
| SLA Compliance | Gauge showing compliance percentage |
| Queue Depth Over Time | Time series by band |
| SLA Compliance Over Time | Trending compliance |
| State Transitions | Rate of state changes |
| Processing Time (p95) | Performance histogram |
| Escalations & Failures | Lifecycle events |
| Resolution Time by Band | Time-to-resolution |
| Stuck & Timeout Events | Watchdog metrics |
| SLA Breaches Today | 24h breach counter |

### 7.3 Alerting Rules

Alert rules deployed from: `devops/observability/prometheus/rules/unknowns-queue-alerts.yaml`

**Critical Alerts:**

| Alert | Condition | Response |
|-------|-----------|----------|
| `UnknownsSlaBreachCritical` | compliance < 80% | Immediate escalation to security team |
| `UnknownsHotQueueHigh` | HOT > 5 for 10m | Prioritize resolution |
| `UnknownsProcessingFailures` | Failed entries in 1h | Manual intervention required |
| `UnknownsSlaMonitorDown` | No metrics for 5m | Check service health |
| `UnknownsHealthCheckUnhealthy` | Health check failing | Check SLA breaches |

**Warning Alerts:**

| Alert | Condition | Response |
|-------|-----------|----------|
| `UnknownsSlaBreachWarning` | 80% ≤ compliance < 95% | Review queue health |
| `UnknownsHotQueuePresent` | HOT > 0 for 1h | Check progress |
| `UnknownsQueueBacklog` | Total > 100 for 30m | Scale processing |
| `UnknownsStuckProcessing` | Processing > 10 for 30m | Check bottlenecks |
| `UnknownsProcessingTimeout` | Timeouts > 5/hour | Review automation |
| `UnknownsEscalationRate` | Escalations > 10/hour | Review criteria |

### 7.4 Metric-Based Troubleshooting

#### SLA Breach Investigation

```bash
# 1. Check current breach status
curl -s "http://prometheus:9090/api/v1/query?query=unknowns_sla_compliance" | jq

# 2. Identify breached entries
curl -s "$UNKNOWNS_API/grey-queue?status=pending" | \
  jq '.items[] | select(.sla_breached == true)'

# 3. Check SLA health endpoint
curl -s "$UNKNOWNS_API/health/sla" | jq

# 4. Review breach timeline
# In Grafana: SLA Compliance Over Time panel, last 24h
```

#### Stuck Processing Investigation

```bash
# 1. Check processing count
curl -s "http://prometheus:9090/api/v1/query?query=greyqueue_processing_count" | jq

# 2. List stuck entries (not processed in the last hour)
curl -s "$UNKNOWNS_API/grey-queue?status=Processing" | \
  jq '.items[] | select((.last_processed_at | fromdateiso8601) < (now - 3600))'

# 3. Check watchdog metrics
# (-G with --data-urlencode keeps curl from treating [1h] as a URL glob range)
curl -sG "http://prometheus:9090/api/v1/query" \
  --data-urlencode 'query=rate(greyqueue_stuck_total[1h])' | jq

# 4. Force retry if needed (set ENTRY_ID to the ID of the stuck entry)
curl -X POST "$UNKNOWNS_API/grey-queue/$ENTRY_ID/retry"
```

#### High Escalation Rate

```bash
# 1. Check escalation rate
# (-G with --data-urlencode keeps curl from treating [1h] as a URL glob range)
curl -sG "http://prometheus:9090/api/v1/query" \
  --data-urlencode 'query=rate(unknowns_escalated_total[1h])' | jq

# 2. Review escalation reasons (entries are under .items, as in the queries above)
curl -s "$UNKNOWNS_API/grey-queue?status=Escalated" | \
  jq '.items | group_by(.escalation_reason) | map({reason: .[0].escalation_reason, count: length})'

# 3. Check for EPSS/KEV spikes
# Events triggering escalations:
# - epss.updated with score increase
# - kev.added events
# - deployment.created with affected components
```

#### Queue Growth Analysis

```bash
# 1. Check inflow rate
# (-G with --data-urlencode keeps curl from treating [1h] as a URL glob range)
curl -sG "http://prometheus:9090/api/v1/query" \
  --data-urlencode 'query=rate(unknowns_enqueued_total[1h])' | jq

# 2. Check resolution rate
curl -sG "http://prometheus:9090/api/v1/query" \
  --data-urlencode 'query=rate(unknowns_resolved_total[1h])' | jq

# 3. Calculate net growth
# growth_rate = inflow_rate - resolution_rate

# 4. Review reasons for new unknowns
curl -s "$UNKNOWNS_API/grey-queue/summary" | jq '.by_reason'
```

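The net-growth calculation in step 3 can also be pushed into a single PromQL expression instead of subtracting the two results by hand. The sketch below only builds the query string; the helper name `growth_query` is hypothetical, and it assumes both counters are exported under the metric names used in the commands above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: builds a PromQL expression for net queue growth
# (inflow rate minus resolution rate) over the given window, default 1h.
growth_query() {
  local window="${1:-1h}"
  printf 'sum(rate(unknowns_enqueued_total[%s])) - sum(rate(unknowns_resolved_total[%s]))\n' \
    "$window" "$window"
}

growth_query 1h

# To run it against Prometheus (same host as in the commands above):
#   curl -sG "http://prometheus:9090/api/v1/query" \
#     --data-urlencode "query=$(growth_query 1h)" | jq
```

A positive result means the queue is growing over the window; a sustained negative result means the backlog is draining.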
### 7.5 Daily Report

```bash
# Generate daily report
stella unknowns report --format email --send-to security-team@example.com

# Report includes:
# - Queue summary (total, by band, by reason)
# - SLA status (in compliance, breaches)
# - Top 10 highest-scored items
# - Newly added items (last 24h)
# - Resolved items (last 24h)
# - KEV item status
# - Trends (7-day, 30-day)
```

---

## 8. Unknown Budgets

Unknown budgets enforce per-environment caps on unknowns by reason code. Budgets can warn or block when exceeded.

**Configuration**:

```yaml
# etc/policy.unknowns.budgets.yaml
unknownBudgets:
  enforceBudgets: true
  budgets:
    prod:
      environment: prod
      totalLimit: 3
      reasonLimits:
        Reachability: 0
        Provenance: 0
        VexConflict: 1
      action: Block
      exceededMessage: "Production requires zero reachability unknowns"

    stage:
      environment: stage
      totalLimit: 10
      reasonLimits:
        Reachability: 1
      action: WarnUnlessException

    dev:
      environment: dev
      totalLimit: null
      action: Warn

    default:
      environment: default
      totalLimit: 5
      action: Warn
```

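As a rough mental model of how a total budget might be evaluated, the following shell sketch compares an unknown count against `totalLimit` and returns the configured action when the cap is exceeded. The helper `budget_decision` is hypothetical, and the sketch rests on two assumptions that the configuration above does not state: a `null` limit means no cap, and a count equal to the limit is still within budget. It does not model `reasonLimits` or exceptions.

```bash
#!/usr/bin/env bash
# Hypothetical helper: decide the outcome for one environment's total budget.
# Usage: budget_decision COUNT TOTAL_LIMIT ACTION
# Assumptions: TOTAL_LIMIT "null" means unlimited; the budget is exceeded only
# when COUNT is strictly greater than TOTAL_LIMIT.
budget_decision() {
  local count="$1" limit="$2" action="$3"
  if [ "$limit" = "null" ] || [ "$count" -le "$limit" ]; then
    echo "Pass"
  else
    echo "$action"
  fi
}

budget_decision 4 3 Block      # prints Block (prod cap of 3 exceeded)
budget_decision 50 null Warn   # prints Pass (dev has no total cap)
```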
**Exception coverage**:

To allow approved exceptions to cover specific unknown reason codes, set the exception metadata key `unknown_reason_codes` to a comma-separated list of codes. Example: `Reachability, U-VEX`.

---

## Related Documentation

- [Unknowns API Reference](../api/score-proofs-reachability-api-reference.md#5-unknowns-api)
- [Triage Technical Reference](../product/advisories/14-Dec-2025%20-%20Triage%20and%20Unknowns%20Technical%20Reference.md)
- [Score Proofs Runbook](./score-proofs-runbook.md)
- [Policy Engine](../modules/policy/architecture.md)
- [Determinization API](../modules/policy/determinization-api.md)
- [VEX Consensus Guide](../VEX_CONSENSUS_GUIDE.md)

---

## 8. Grey Queue Operations

> **Sprint**: SPRINT_20260112_010_CLI_unknowns_grey_queue_cli

The Grey Queue handles observations with uncertain status requiring operator attention or additional evidence. These are distinct from standard HOT/WARM/COLD band unknowns.

### 8.1 Grey Queue Overview

Grey Queue items have:

- **Observation state**: `PendingDeterminization`, `Disputed`, or `GuardedPass`
- **Reanalysis fingerprint**: Deterministic ID for reproducible replays
- **Triggers**: Events that caused reanalysis
- **Conflicts**: Detected evidence disagreements
- **Next actions**: Suggested resolution paths

### 8.2 List Grey Queue Items

```bash
# List all grey queue items
stella unknowns list --state grey

# List by observation state
stella unknowns list --observation-state pending-determinization
stella unknowns list --observation-state disputed
stella unknowns list --observation-state guarded-pass

# List with fingerprint details
stella unknowns list --state grey --show-fingerprint

# List with conflict summary
stella unknowns list --state grey --show-conflicts
```

### 8.3 View Grey Queue Details

```bash
# Show grey queue item with full details
stella unknowns show unk-12345678-... --grey

# Output:
# ID: unk-12345678-...
# Observation State: Disputed
#
# Reanalysis Fingerprint:
#   ID: sha256:abc123...
#   Computed At: 2026-01-15T10:00:00Z
#   Policy Config Hash: sha256:def456...
#
# Triggers (2):
#   - epss.updated@1 (2026-01-15T09:55:00Z) delta=0.15
#   - vex.updated@1 (2026-01-15T09:50:00Z)
#
# Conflicts (1):
#   - VexStatusConflict: vendor-a reports 'not_affected', vendor-b reports 'affected'
#     Severity: high
#     Adjudication: manual_review
#
# Next Actions:
#   - trust_resolution: Resolve issuer trust conflict
#   - manual_review: Escalate to security team

# Show fingerprint only
stella unknowns fingerprint unk-12345678-...

# Show triggers only
stella unknowns triggers unk-12345678-...
```

### 8.4 Grey Queue Triage Actions

```bash
# Resolve a grey queue item (operator determination)
stella unknowns resolve unk-12345678-... \
  --status not_affected \
  --justification "Verified vendor VEX is authoritative" \
  --evidence-ref "vex-observation-id-123"

# Escalate for manual review
stella unknowns escalate unk-12345678-... \
  --priority P1 \
  --reason "Conflicting VEX requires security team decision"

# Defer pending additional evidence
stella unknowns defer unk-12345678-... \
  --await vex \
  --reason "Waiting for upstream vendor VEX statement"
```

### 8.5 Grey Queue Conflict Resolution

```bash
# List items with conflicts
stella unknowns list --has-conflicts

# Filter by conflict type
stella unknowns list --conflict-type vex-status-conflict
stella unknowns list --conflict-type vex-reachability-contradiction
stella unknowns list --conflict-type trust-tie

# Resolve a conflict manually
stella unknowns resolve-conflict unk-12345678-... \
  --winner vendor-a \
  --reason "vendor-a is the upstream maintainer"
```

### 8.6 Grey Queue Summary

```bash
# Get grey queue summary
stella unknowns summary --grey

# Output:
# Grey Queue: 23 items
#
# By State:
#   PendingDeterminization: 15 (65%)
#   Disputed: 5 (22%)
#   GuardedPass: 3 (13%)
#
# Conflicts: 8 items have conflicts
# Avg. Triggers: 2.3 per item
# Oldest: 7 days
```

### 8.7 Grey Queue Export

```bash
# Export grey queue for analysis
stella unknowns export --state grey --format json --output grey-queue.json

# Export with full fingerprints and triggers
stella unknowns export --state grey --verbose --output grey-full.json

# Export conflicts only
stella unknowns export --has-conflicts --format csv --output conflicts.csv
```

---

**Last Updated**: 2026-01-16
**Version**: 1.1.0
**Sprint**: SPRINT_20260112_010_CLI_unknowns_grey_queue_cli