feat: Add operations runbooks and UI API models for Sprint 3500.0004.x

Operations documentation:
- docs/operations/reachability-runbook.md - Reachability troubleshooting guide
- docs/operations/unknowns-queue-runbook.md - Unknowns queue management guide

UI TypeScript models:
- src/Web/StellaOps.Web/src/app/core/api/proof.models.ts - Proof ledger types
- src/Web/StellaOps.Web/src/app/core/api/reachability.models.ts - Reachability types
- src/Web/StellaOps.Web/src/app/core/api/unknowns.models.ts - Unknowns queue types

Sprint: SPRINT_3500_0004_0002 (UI), SPRINT_3500_0004_0004 (Docs)
594
docs/operations/reachability-runbook.md
Normal file
@@ -0,0 +1,594 @@
# Reachability Analysis Operations Runbook

> **Version**: 1.0.0
> **Sprint**: 3500.0004.0004
> **Last Updated**: 2025-12-20

This runbook covers operational procedures for Reachability Analysis, including call graph management, analysis troubleshooting, and explain queries.

---

## Table of Contents

1. [Overview](#1-overview)
2. [Call Graph Operations](#2-call-graph-operations)
3. [Reachability Computation](#3-reachability-computation)
4. [Explain Queries](#4-explain-queries)
5. [Troubleshooting](#5-troubleshooting)
6. [Monitoring & Alerting](#6-monitoring--alerting)
7. [Escalation Procedures](#7-escalation-procedures)

---

## 1. Overview

### What is Reachability Analysis?

Reachability Analysis determines whether vulnerable code is actually reachable from application entrypoints. This reduces false positives by filtering out vulnerabilities in code that cannot be executed.

### Reachability Statuses

| Status | Confidence | Description |
|--------|------------|-------------|
| `UNREACHABLE` | High | No path from entrypoints to vulnerable code |
| `POSSIBLY_REACHABLE` | Medium | Path exists but contains heuristic edges |
| `REACHABLE_STATIC` | High | Static analysis proves path exists |
| `REACHABLE_PROVEN` | Very High | Runtime evidence confirms execution |
| `UNKNOWN` | Low | Insufficient data to determine |

### Key Components

| Component | Purpose | Location |
|-----------|---------|----------|
| Call Graph Extractor | Language-specific CG extraction | Scanner Worker plugins |
| Call Graph Store | Persistent graph storage | `scanner.cg_node`, `scanner.cg_edge` |
| Reachability Analyzer | BFS pathfinding algorithm | Scanner Core library |
| Entrypoint Detector | Identifies application entrypoints | Language-specific plugins |

### Prerequisites

- Access to Scanner WebService API
- `scanner.reachability` OAuth scope
- CLI access with `stella` configured
- Language-specific workers deployed (dotnet, java, etc.)

---

## 2. Call Graph Operations

### 2.1 Call Graph Upload

```bash
# Upload via API
# --data-binary sends the file byte-for-byte, so the body matches the digest
# (plain -d strips newlines from the file before sending)
curl -X POST "https://scanner.example.com/api/v1/scanner/scans/$SCAN_ID/callgraphs" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -H "Content-Digest: sha256=$(sha256sum callgraph.json | cut -d' ' -f1)" \
  --data-binary @callgraph.json

# Upload via CLI
stella scan graph upload --scan-id $SCAN_ID --file callgraph.json

# Upload streaming NDJSON (for large graphs)
stella scan graph upload --scan-id $SCAN_ID \
  --file callgraph.ndjson \
  --format ndjson \
  --streaming
```

### 2.2 Call Graph Inspection

```bash
# Get call graph summary
stella scan graph summary --scan-id $SCAN_ID

# Output:
# Nodes: 12,345
# Edges: 56,789
# Entrypoints: 42
# Languages: [dotnet, java]
# Size: 15.2 MB

# List entrypoints
stella scan graph entrypoints --scan-id $SCAN_ID

# Export full graph (for debugging)
stella scan graph export --scan-id $SCAN_ID --output graph.json

# Visualize subgraph (requires GraphViz)
stella scan graph visualize --scan-id $SCAN_ID \
  --node sha256:node123... \
  --depth 3 \
  --output subgraph.svg
```

### 2.3 Call Graph Validation

```bash
# Validate graph structure
stella scan graph validate --scan-id $SCAN_ID

# Checks performed:
# - All edge targets exist as nodes
# - Entrypoints reference valid nodes
# - No orphan nodes
# - No cycles in entrypoint definitions
# - Schema compliance

# Validate before upload
stella scan graph validate --file callgraph.json --strict
```
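
Two of the checks listed above (edge targets exist as nodes, no orphan nodes) can be approximated offline while debugging a rejected graph. The following is a minimal sketch, not the `stella` validator. It assumes the graph has already been flattened into two hypothetical text files: a node file with one node ID per line and an edge file with `<from> <to>` per line. The function names and the definition of "orphan" (a node with no incident edge) are assumptions for illustration.

```bash
# Sketch only: offline approximations of two structural checks.
# Assumed inputs (hypothetical layout, not the Call Graph Schema):
#   $1 = node file, one node ID per line (must be non-empty)
#   $2 = edge file, "<from> <to>" per line (must be non-empty)

# Print every edge whose target is not a known node.
find_dangling_edges() {
  awk 'NR == FNR { nodes[$1] = 1; next }
       !($2 in nodes) { print $1, $2 }' "$1" "$2"
}

# Print every node that appears in no edge, neither as source nor as target.
find_orphan_nodes() {
  awk 'NR == FNR { used[$1] = 1; used[$2] = 1; next }
       !($1 in used) { print $1 }' "$2" "$1"
}
```

Empty output from both helpers means neither problem was found in the flattened files; the authoritative result is still whatever `stella scan graph validate` reports.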

### 2.4 Call Graph Merging

When multiple language workers produce graphs for the same scan, the per-language graphs are merged into one. To check the merge or repeat it:

```bash
# View merge status
stella scan graph merges --scan-id $SCAN_ID

# Output:
# Language | Nodes  | Edges  | Status
# dotnet   | 8,234  | 34,567 | merged
# java     | 4,111  | 22,222 | merged
# Total    | 12,345 | 56,789 | complete

# Force re-merge (after fix)
stella scan graph merge --scan-id $SCAN_ID --force
```

---

## 3. Reachability Computation

### 3.1 Triggering Computation

```bash
# Trigger via API
curl -X POST "https://scanner.example.com/api/v1/scanner/scans/$SCAN_ID/reachability/compute" \
  -H "Authorization: Bearer $TOKEN"

# Trigger via CLI
stella reachability compute --scan-id $SCAN_ID

# Trigger with options
stella reachability compute --scan-id $SCAN_ID \
  --max-depth 20 \
  --indirect-resolution conservative \
  --timeout 300s
```

### 3.2 Computation Options

| Option | Default | Description |
|--------|---------|-------------|
| `max-depth` | 10 | Maximum path length to explore |
| `indirect-resolution` | `conservative` | How to handle indirect calls: `conservative`, `aggressive`, `skip` |
| `timeout` | 300s | Maximum computation time |
| `parallel` | true | Parallel BFS from multiple entrypoints |
| `include-runtime` | true | Merge runtime evidence if available |
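
To make the effect of `max-depth` concrete, here is a minimal sketch of a depth-bounded BFS over an edge list. It is not the Scanner Core implementation. The input format (`<from> <to>` per line), the function name, and the choice to count depth in edges are assumptions; the analyzer may count depth differently.

```bash
# Sketch only: depth-bounded breadth-first search over an edge list.
#   $1 = edge file, "<from> <to>" per line (hypothetical format)
#   $2 = entrypoint node, $3 = target node, $4 = max depth (counted in edges)
bfs_reachable() {
  awk -v start="$2" -v target="$3" -v maxd="$4" '
    { adj[$1] = adj[$1] " " $2 }          # build adjacency lists
    END {
      depth[start] = 0; queue[1] = start; head = 1; tail = 1
      while (head <= tail) {
        node = queue[head++]
        if (node == target) { print "reachable depth=" depth[node]; exit 0 }
        if (depth[node] >= maxd) continue  # do not expand past the bound
        n = split(adj[node], nbrs, " ")
        for (i = 1; i <= n; i++) {
          if (!(nbrs[i] in depth)) {       # first visit is the shortest path
            depth[nbrs[i]] = depth[node] + 1
            queue[++tail] = nbrs[i]
          }
        }
      }
      print "unreachable"
    }' "$1"
}
```

With the chain `A -> B -> C -> D`, the target `D` is found when the bound is 3 or more and reported as unreachable when the bound is 2. This is why lowering `--max-depth` shortens computation but can turn a real path into a missed one.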

### 3.3 Job Monitoring

```bash
# Check job status
stella reachability job-status --job-id reachability-job-001

# Output:
# Status: running
# Progress: 67% (8,234 / 12,345 nodes visited)
# Started: 2025-12-20T10:00:00Z
# Estimated completion: 2025-12-20T10:02:30Z

# Stream job logs
stella reachability job-logs --job-id reachability-job-001 --follow

# Cancel running job
stella reachability job-cancel --job-id reachability-job-001
```

### 3.4 Computation Results

```bash
# Get summary
stella reachability summary --scan-id $SCAN_ID

# Output:
# Total vulnerabilities: 45
# Unreachable: 38 (84%)
# Possibly reachable: 4 (9%)
# Reachable (static): 2 (4%)
# Reachable (proven): 1 (2%)
# Unknown: 0 (0%)

# Get detailed findings
stella reachability findings --scan-id $SCAN_ID --format json

# Filter by status
stella reachability findings --scan-id $SCAN_ID --status REACHABLE_STATIC

# Export for CI gate
stella reachability findings --scan-id $SCAN_ID \
  --status REACHABLE_STATIC,REACHABLE_PROVEN \
  --format sarif \
  --output findings.sarif
```
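
A CI gate built on these results could look like the sketch below. It assumes the finding statuses have already been extracted to one status per line on stdin; how you extract them depends on the export format and is not shown. Treating exactly `REACHABLE_STATIC` and `REACHABLE_PROVEN` as blocking mirrors the export filter above but is an assumption, not a documented policy, and the function names are illustrative.

```bash
# Sketch only: fail a pipeline step when any finding has a blocking status.

# Returns 0 when the status should block the build (assumed policy).
is_blocking_status() {
  case "$1" in
    REACHABLE_STATIC|REACHABLE_PROVEN) return 0 ;;
    *) return 1 ;;
  esac
}

# Reads one status per line on stdin, prints the blocking count, and
# returns non-zero when that count is greater than zero.
ci_gate() {
  blocking=0
  while IFS= read -r status; do
    if is_blocking_status "$status"; then
      blocking=$((blocking + 1))
    fi
  done
  echo "blocking findings: $blocking"
  [ "$blocking" -eq 0 ]
}
```

Because `ci_gate` returns a non-zero exit code when blocking findings exist, it can be the last command of a pipeline step without extra glue.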

---

## 4. Explain Queries

### 4.1 Explain Single Finding

```bash
# Via API
curl "https://scanner.example.com/api/v1/scanner/scans/$SCAN_ID/reachability/explain?cve=CVE-2024-1234&purl=pkg:npm/lodash@4.17.20" \
  -H "Authorization: Bearer $TOKEN"

# Via CLI
stella reachability explain --scan-id $SCAN_ID \
  --cve CVE-2024-1234 \
  --purl "pkg:npm/lodash@4.17.20"

# Output:
# Status: REACHABLE_STATIC
# Confidence: 0.70
#
# Shortest Path (depth=3):
#   [0] MyApp.Controllers.OrdersController::Get(Guid)
#       Entrypoint: HTTP GET /api/orders/{id}
#   [1] MyApp.Services.OrderService::Process(Order)
#       Edge: static (direct_call)
#   [2] Lodash.merge(Object, Object) [VULNERABLE]
#       Edge: static (direct_call)
#
# Why Reachable:
# - Static call path exists from HTTP entrypoint
# - All edges are statically proven
# - Vulnerable function is directly invoked
```

### 4.2 Explain with Alternatives

```bash
# Show all paths (not just shortest)
stella reachability explain --scan-id $SCAN_ID \
  --cve CVE-2024-1234 \
  --purl "pkg:npm/lodash@4.17.20" \
  --all-paths

# Output includes:
# Alternative paths found: 3
# Path 1 (depth=3): ... [shown above]
# Path 2 (depth=5): Controllers.UserController -> ... -> Lodash.merge
# Path 3 (depth=7): Background.JobProcessor -> ... -> Lodash.merge
```

### 4.3 Why Unreachable

```bash
# Explain why vulnerability is unreachable
stella reachability explain --scan-id $SCAN_ID \
  --cve CVE-2024-5678 \
  --purl "pkg:npm/unused-lib@1.0.0"

# Output:
# Status: UNREACHABLE
# Confidence: 0.95
#
# Why Unreachable:
# - No path found from any entrypoint
# - Vulnerable function: UnusedLib.dangerousMethod()
# - Function visibility: private
# - Callers found: 0
# - Dead code analysis: likely dead code
```

### 4.4 Batch Explain

```bash
# Export all reachability explanations
stella reachability explain-all --scan-id $SCAN_ID \
  --output explanations.json

# Explain only reachable findings
stella reachability explain-all --scan-id $SCAN_ID \
  --status REACHABLE_STATIC,REACHABLE_PROVEN \
  --output reachable-explanations.json
```

---

## 5. Troubleshooting

### 5.1 Call Graph Too Large

**Symptom**: Upload fails with "413 Payload Too Large".

**Diagnosis**:

```bash
# Check graph size
du -h callgraph.json
wc -l callgraph.json

# Count nodes/edges
jq '.nodes | length' callgraph.json
jq '.edges | length' callgraph.json
```

**Resolution**:

```bash
# Option 1: Use streaming upload
stella scan graph upload --scan-id $SCAN_ID \
  --file callgraph.json \
  --streaming

# Option 2: Convert to NDJSON
stella scan graph convert --input callgraph.json \
  --output callgraph.ndjson \
  --format ndjson

# Option 3: Partition by artifact
stella scan graph partition --input callgraph.json \
  --output-dir ./partitions/ \
  --by artifact
```

### 5.2 Missing Entrypoints

**Symptom**: "No entrypoints found" warning.

**Diagnosis**:

```bash
# Check entrypoint detection
stella scan graph entrypoints --scan-id $SCAN_ID --verbose

# Check for framework detection
stella scan graph detect-framework --scan-id $SCAN_ID
```

**Common causes**:

1. **Framework not detected**: Add framework hints
2. **Custom entrypoints**: Manually specify
3. **Wrong language worker**: Check artifact analysis

**Resolution**:

```bash
# Specify framework explicitly
stella scan graph upload --scan-id $SCAN_ID \
  --file callgraph.json \
  --framework aspnetcore

# Add custom entrypoints
stella scan graph entrypoint add --scan-id $SCAN_ID \
  --node sha256:node123... \
  --kind http \
  --route "/api/custom"
```

### 5.3 Reachability Computation Timeout

**Symptom**: Job fails with "computation timeout".

**Diagnosis**:

```bash
# Check computation stats
stella reachability job-stats --job-id reachability-job-001

# Output:
# Nodes visited: 500,000
# Edges traversed: 2,500,000
# Time elapsed: 300s
# Memory used: 4.2 GB
```

**Resolution**:

```bash
# Option 1: Increase timeout
stella reachability compute --scan-id $SCAN_ID --timeout 600s

# Option 2: Reduce depth
stella reachability compute --scan-id $SCAN_ID --max-depth 5

# Option 3: Skip indirect calls
stella reachability compute --scan-id $SCAN_ID --indirect-resolution skip

# Option 4: Partition analysis
stella reachability compute --scan-id $SCAN_ID --partition-by artifact
```

### 5.4 Inconsistent Results

**Symptom**: Different results between runs.

**Diagnosis**:

```bash
# Check determinism settings
stella scan manifest --scan-id $SCAN_ID | jq '.deterministic, .seed'

# Compare graph hashes
stella scan graph hash --scan-id $SCAN_ID
```

**Resolution**:

```bash
# Ensure deterministic mode
stella reachability compute --scan-id $SCAN_ID \
  --deterministic \
  --seed "AQIDBA==" # Fixed seed

# Use same graph version
stella reachability compute --scan-id $SCAN_ID \
  --graph-digest sha256:cg123...
```

### 5.5 False Positives/Negatives

**Symptom**: Reachability verdict seems incorrect.

**Diagnosis**:

```bash
# Get detailed explanation
stella reachability explain --scan-id $SCAN_ID \
  --cve CVE-2024-1234 \
  --purl "pkg:npm/lodash@4.17.20" \
  --verbose

# Check edge confidence
stella scan graph edge --scan-id $SCAN_ID \
  --from sha256:nodeA... \
  --to sha256:nodeB...
```

**Common causes for false positives**:

1. **Heuristic edges**: Indirect call resolution too aggressive
2. **Reflection/dynamic calls**: May create false paths
3. **Dead code not detected**: Code exists but never executes

**Common causes for false negatives**:

1. **Missing edges**: Call graph incomplete
2. **Indirect calls skipped**: Resolution too conservative
3. **Cross-language calls**: Language boundary not bridged

**Resolution**:

```bash
# Adjust indirect call resolution
# (too aggressive tends to cause false positives, too conservative false negatives)
stella reachability compute --scan-id $SCAN_ID \
  --indirect-resolution conservative

# Add runtime evidence
stella scan evidence upload --scan-id $SCAN_ID \
  --file runtime-trace.json

# Report false positive/negative for ML training
stella reachability feedback --scan-id $SCAN_ID \
  --cve CVE-2024-1234 \
  --verdict false-positive \
  --reason "Dead code - feature flag disabled"
```

---

## 6. Monitoring & Alerting

### 6.1 Key Metrics

| Metric | Description | Alert Threshold |
|--------|-------------|-----------------|
| `callgraph_upload_duration_seconds` | Time to upload call graph | > 60s |
| `callgraph_size_bytes` | Size of uploaded graphs | > 200MB |
| `reachability_computation_duration_seconds` | Time to compute reachability (p95) | > 300s |
| `reachability_nodes_visited` | Nodes visited during BFS | > 1M |
| `reachability_job_failures_total` | Failed computation jobs | > 5/hour |
| `entrypoint_detection_rate` | % of scans with entrypoints | < 80% |

### 6.2 Grafana Dashboard

```
Dashboard: Reachability Operations
Panels:
- Call graph upload throughput
- Graph size distribution
- Computation duration (p50, p95, p99)
- Reachability verdict distribution
- Job queue depth
- Entrypoint detection rate
```

### 6.3 Alerting Rules

```yaml
groups:
  - name: reachability
    rules:
      - alert: ReachabilityComputationSlow
        # Assumes a classic Prometheus histogram that exposes _bucket series;
        # histogram_quantile needs per-bucket rates grouped by the le label.
        expr: histogram_quantile(0.95, sum by (le) (rate(reachability_computation_duration_seconds_bucket[5m]))) > 300
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Reachability computation is slow"

      - alert: ReachabilityJobFailures
        expr: increase(reachability_job_failures_total[1h]) > 5
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Multiple reachability job failures"

      - alert: LowEntrypointDetectionRate
        expr: entrypoint_detection_rate < 0.8
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Entrypoint detection rate is low"
```

---

## 7. Escalation Procedures

### 7.1 Escalation Matrix

| Severity | Condition | Response Time | Escalation Path |
|----------|-----------|---------------|-----------------|
| P1 | Reachability failing for all scans | 15 min | On-call → Team Lead |
| P2 | Computation failures > 20% | 1 hour | On-call → Team Lead |
| P3 | Computation latency > 600s p95 | 4 hours | On-call |
| P4 | Entrypoint detection < 70% | 24 hours | Ticket |
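
The matrix can be read top-down as a first-match rule. The sketch below encodes it that way. The three inputs are hypothetical values you would pull from your own metrics, "failing for all scans" is interpreted as a 100% failure rate, and the function name is illustrative.

```bash
# Sketch only: map observed conditions to the severities in the matrix above.
#   $1 = failed computations as a percentage of scans (0-100)
#   $2 = p95 computation latency in seconds
#   $3 = entrypoint detection rate as a percentage (0-100)
classify_severity() {
  awk -v fail="$1" -v p95="$2" -v ep="$3" 'BEGIN {
    if (fail >= 100)     print "P1"    # assumed meaning of "failing for all scans"
    else if (fail > 20)  print "P2"
    else if (p95 > 600)  print "P3"
    else if (ep < 70)    print "P4"
    else                 print "none"
  }'
}
```

For example, `classify_severity 25 120 95` prints `P2`, because a 25% failure rate exceeds the 20% threshold even though latency and entrypoint detection are healthy.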

### 7.2 P1 Response Procedure

1. **Acknowledge** alert
2. **Triage**:
   ```bash
   # Check worker health
   stella scanner workers status

   # Check graph store connectivity
   stella health check --service graph-store

   # Check recent failures
   stella reachability jobs --status failed --last 10
   ```
3. **Mitigate**:
   ```bash
   # Scale up workers if queue backlog
   kubectl scale deployment scanner-worker --replicas=10

   # Clear stuck jobs
   stella reachability jobs cancel --status stuck
   ```
4. **Communicate**: Update status page
5. **Resolve**: Fix root cause
6. **Postmortem**: Document within 48 hours

---

## Related Documentation

- [Reachability API Reference](../api/score-proofs-reachability-api-reference.md)
- [Scanner Architecture](../modules/scanner/architecture.md)
- [Call Graph Schema](../schemas/callgraph-v1.md)
- [Entrypoint Detection](../modules/scanner/operations/entrypoint-problem.md)

---

**Last Updated**: 2025-12-20
**Version**: 1.0.0
**Sprint**: 3500.0004.0004

590
docs/operations/unknowns-queue-runbook.md
Normal file
@@ -0,0 +1,590 @@
# Unknowns Queue Management Runbook

> **Version**: 1.0.0
> **Sprint**: 3500.0004.0004
> **Last Updated**: 2025-12-20

This runbook covers operational procedures for managing the Unknowns queue, including triage, escalation, resolution, and queue health maintenance.

---

## Table of Contents

1. [Overview](#1-overview)
2. [Queue Operations](#2-queue-operations)
3. [Triage Procedures](#3-triage-procedures)
4. [Escalation Workflows](#4-escalation-workflows)
5. [Resolution Procedures](#5-resolution-procedures)
6. [Troubleshooting](#6-troubleshooting)
7. [Monitoring & Alerting](#7-monitoring--alerting)

---

## 1. Overview

### What are Unknowns?

Unknowns are items that could not be fully classified during scanning due to:

- Missing VEX statements
- Ambiguous indirect calls in call graphs
- Incomplete SBOM data
- Missing advisory information
- Conflicting evidence from multiple sources

### Unknown Ranking

Unknowns are ranked using a weighted scoring model with three factors and a containment deduction. The deduction is zero or negative, so detected mitigations lower the score:

```
score = 0.60 × blast + 0.30 × scarcity + 0.30 × pressure + containment_deduction
```

| Factor | Weight | Description |
|--------|--------|-------------|
| Blast Radius | 0.60 | Impact scope (dependents, network exposure) |
| Evidence Scarcity | 0.30 | How much data is missing |
| Exploit Pressure | 0.30 | EPSS score, KEV status |
| Containment | -0.20 | Mitigation factors (seccomp, read-only FS) |
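
As a worked example of the formula, the sketch below evaluates it with `awk`. It assumes the three factors are already normalized to the range 0 to 1 and that the containment deduction is passed as a ready-made zero or negative number; how the scanner derives those inputs is not covered here, and the function name is illustrative.

```bash
# Sketch only: evaluate the ranking formula above.
#   $1 = blast, $2 = scarcity, $3 = pressure (assumed to be in [0, 1])
#   $4 = containment deduction (zero or negative, e.g. -0.20)
unknown_score() {
  awk -v b="$1" -v s="$2" -v p="$3" -v c="$4" \
    'BEGIN { printf "%.2f\n", 0.60 * b + 0.30 * s + 0.30 * p + c }'
}
```

For instance, `unknown_score 0.5 0.7 0.4 -0.20` prints `0.43`: 0.30 from blast, 0.21 from scarcity, 0.12 from pressure, minus 0.20 for containment.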

### Band Assignment

| Band | Score Range | Priority | SLA |
|------|-------------|----------|-----|
| HOT | ≥ 0.70 | Critical | 24 hours |
| WARM | 0.40 - 0.69 | Normal | 7 days |
| COLD | < 0.40 | Low | 30 days |
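
The band boundaries translate into a simple threshold check. This is a minimal sketch with an illustrative function name. It treats every score below 0.70 and at or above 0.40 as WARM, which is an assumption for values such as 0.695 that fall between the table's rounded ranges.

```bash
# Sketch only: assign a band from a score using the thresholds above.
#   $1 = score
unknown_band() {
  awk -v s="$1" 'BEGIN {
    if (s >= 0.70)       print "HOT"
    else if (s >= 0.40)  print "WARM"
    else                 print "COLD"
  }'
}
```

`unknown_band 0.62` prints `WARM`, and `unknown_band 0.70` prints `HOT` because the HOT threshold is inclusive.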

---

## 2. Queue Operations

### 2.1 View Queue Status

```bash
# Get queue summary
stella unknowns summary

# Output:
# Total: 142 unknowns
# HOT: 12 (8%) - Requires immediate attention
# WARM: 85 (60%) - Normal priority
# COLD: 45 (32%) - Low priority
#
# KEV items: 3
# Average score: 0.52

# Get queue summary via API
curl "https://scanner.example.com/api/v1/unknowns/summary" \
  -H "Authorization: Bearer $TOKEN"
```

### 2.2 List Unknowns

```bash
# List all HOT unknowns
stella unknowns list --band HOT

# List by score (highest first)
stella unknowns list --sort score --order desc --limit 20

# Filter by reason
stella unknowns list --reason missing_vex

# Filter by artifact
stella unknowns list --artifact sha256:abc123...

# Filter by KEV status
stella unknowns list --kev true
```

### 2.3 View Unknown Details

```bash
# Get detailed view
stella unknowns show unk-12345678-abcd-1234-5678-abcdef123456

# Output:
# ID: unk-12345678-...
# Artifact: pkg:oci/myapp@sha256:abc123
# Reasons: [missing_vex, ambiguous_indirect_call]
#
# Blast Radius:
#   Dependents: 15 services
#   Network: internet-facing
#   Privilege: user
#
# Evidence Scarcity: 0.7 (high)
#
# Exploit Pressure:
#   EPSS: 0.45
#   KEV: false
#
# Containment:
#   Seccomp: enforced (-0.10)
#   Filesystem: read-only (-0.10)
#
# Score: 0.62 (WARM band)
# Score Breakdown:
#   Blast component: +0.35
#   Scarcity component: +0.21
#   Pressure component: +0.26
#   Containment deduction: -0.20

# Show proof tree
stella unknowns proof unk-12345678-...
```

### 2.4 Export Queue Data

```bash
# Export for analysis
stella unknowns export --format json --output unknowns.json

# Export HOT items for daily review
stella unknowns export --band HOT --format csv --output hot-unknowns.csv

# Export with full details
stella unknowns export --verbose --include-proofs --output full-export.json
```

---

## 3. Triage Procedures

### 3.1 Daily Triage Workflow

**Schedule**: Daily at 9:00 AM

**Duration**: 30 minutes

**Participants**: Security analyst, on-call engineer

**Process**:

```bash
# 1. Get today's queue snapshot
stella unknowns snapshot --output daily-$(date +%Y%m%d).json

# 2. Review all HOT items
stella unknowns list --band HOT --since 24h

# 3. For each HOT unknown, determine action:
# - Escalate: Trigger immediate rescan
# - Investigate: Needs manual analysis
# - Defer: Move to WARM (with justification)
# - Resolve: Evidence found, can close

# 4. Process each item
stella unknowns triage unk-12345678-... --action escalate
stella unknowns triage unk-87654321-... --action investigate --notes "Need VEX from vendor"
stella unknowns triage unk-11111111-... --action defer --reason "False positive suspected"
```

### 3.2 Triage Decision Matrix

| Reason Code | KEV | EPSS > 0.5 | Action |
|-------------|-----|------------|--------|
| `missing_vex` | Yes | Any | Escalate + Vendor outreach |
| `missing_vex` | No | Yes | Escalate |
| `missing_vex` | No | No | Request VEX |
| `ambiguous_indirect_call` | Any | Any | Manual code review |
| `incomplete_sbom` | Any | Any | Rescan with updated extractor |
| `conflicting_evidence` | Any | Any | Manual analysis |
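
The decision matrix maps directly onto a `case` statement, which can help when scripting a first pass over an exported queue. This is a minimal sketch: the function name and the `yes`/`no` encoding of KEV status are assumptions, and the printed actions are copied from the table rather than tied to any `stella` subcommand.

```bash
# Sketch only: look up the recommended action from the matrix above.
#   $1 = reason code, $2 = KEV status ("yes" or "no"), $3 = EPSS score
triage_action() {
  case "$1" in
    missing_vex)
      if [ "$2" = "yes" ]; then
        echo "Escalate + Vendor outreach"
      elif awk -v e="$3" 'BEGIN { exit !(e > 0.5) }'; then
        echo "Escalate"
      else
        echo "Request VEX"
      fi ;;
    ambiguous_indirect_call) echo "Manual code review" ;;
    incomplete_sbom)         echo "Rescan with updated extractor" ;;
    conflicting_evidence)    echo "Manual analysis" ;;
    *) echo "No matrix entry for reason: $1" >&2; return 1 ;;
  esac
}
```

Reason codes outside the matrix return a non-zero status instead of a guessed action, so a wrapper script can route them to manual triage.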

### 3.3 Triage Templates

```bash
# Quick escalate (HOT + KEV)
stella unknowns triage unk-... --action escalate \
  --priority P1 \
  --notes "KEV item, requires immediate attention"

# Request vendor VEX
stella unknowns triage unk-... --action investigate \
  --notes "Requested VEX from vendor via security@vendor.com" \
  --due-date 7d

# Mark for code review
stella unknowns triage unk-... --action investigate \
  --notes "Requires manual code review to resolve indirect call" \
  --assign @code-review-team

# Defer with justification
stella unknowns triage unk-... --action defer \
  --reason "Component not deployed to production" \
  --evidence "deployment-manifest.yaml shows staging-only"
```

---

## 4. Escalation Workflows

### 4.1 Automatic Escalation

Unknowns are automatically escalated when:

- Score increases above HOT threshold (0.70)
- KEV status added to related CVE
- EPSS score increases significantly (> 0.2 delta)
- Blast radius increases (new dependents detected)

**Configure auto-escalation**:

```yaml
# policy.unknowns.escalation.yaml
autoEscalation:
  enabled: true
  triggers:
    - condition: score >= 0.70
      action: escalate
      notify: [security-team]
    - condition: kev == true
      action: escalate
      priority: P1
      notify: [security-team, management]
    - condition: epss_delta > 0.2
      action: escalate
      notify: [security-team]
```
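
To reason about which trigger would fire for a given unknown, the three configured conditions can be evaluated by hand. The sketch below is illustrative only and is not how the policy engine evaluates the file. In particular, checking `kev` first (because it carries the explicit P1 priority) is an assumption; the real precedence when several conditions match is not stated here.

```bash
# Sketch only: report which configured trigger would match first.
#   $1 = score, $2 = kev ("true" or "false"), $3 = epss_delta
escalation_trigger() {
  awk -v score="$1" -v kev="$2" -v delta="$3" 'BEGIN {
    if (kev == "true")       print "kev (P1)"    # assumed to take precedence
    else if (score >= 0.70)  print "score"
    else if (delta > 0.2)    print "epss_delta"
    else                     print "none"
  }'
}
```

`escalation_trigger 0.50 false 0.25` prints `epss_delta`: the score is below the HOT threshold and the CVE is not in KEV, but the EPSS jump exceeds 0.2.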

### 4.2 Manual Escalation

```bash
# Escalate via CLI
stella unknowns escalate unk-12345678-...

# Escalate with reason
stella unknowns escalate unk-12345678-... \
  --reason "Customer reported potential exploit"

# Escalate to trigger rescan
stella unknowns escalate unk-12345678-... --rescan

# Output:
# Escalated: unk-12345678-...
# Rescan job: rescan-job-001
# Status: queued
# ETA: 5 minutes
```

### 4.3 Bulk Escalation

```bash
# Escalate all KEV items
stella unknowns escalate --filter "kev=true" --reason "KEV bulk escalation"

# Escalate high-score items
stella unknowns escalate --filter "score>=0.8" --rescan

# Escalate by artifact
stella unknowns escalate --artifact sha256:abc123... --reason "Production incident"
```

### 4.4 Escalation SLA Tracking

```bash
# Check SLA status
stella unknowns sla-status

# Output:
# HOT unknowns SLA (24h):
#   In SLA: 10 (83%)
#   Breached: 2 (17%)
#
# Breached items:
#   unk-111... (26h old) - missing_vex
#   unk-222... (30h old) - conflicting_evidence

# Get SLA breach notifications
stella unknowns list --sla-breached
```
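
The breach status shown above follows from the per-band SLAs in the Band Assignment table (24 hours, 7 days, 30 days). A minimal sketch of that comparison, assuming the age is available in whole hours and that an item exactly at the limit is still within SLA; the function name is illustrative.

```bash
# Sketch only: compare an unknown's age with its band SLA.
#   $1 = band (HOT, WARM, or COLD), $2 = age in whole hours
sla_status() {
  case "$1" in
    HOT)  limit=24 ;;
    WARM) limit=168 ;;   # 7 days
    COLD) limit=720 ;;   # 30 days
    *)    echo "unknown band: $1" >&2; return 2 ;;
  esac
  if [ "$2" -gt "$limit" ]; then
    echo "breached"
  else
    echo "in-sla"
  fi
}
```

`sla_status HOT 26` prints `breached`, matching the 26h-old item in the sample output, while `sla_status WARM 26` prints `in-sla`.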

---

## 5. Resolution Procedures

### 5.1 Resolution Types

| Resolution | Description | Evidence Required |
|------------|-------------|-------------------|
| `not_affected` | Vulnerability doesn't apply | VEX statement or manual analysis |
| `fixed` | Vulnerability patched | Version upgrade confirmation |
| `mitigated` | Controls in place | Mitigation documentation |
| `false_positive` | Incorrect classification | Analysis report |
| `wont_fix` | Accepted risk | Risk acceptance form |

### 5.2 Resolve Unknown

```bash
# Resolve as not affected
stella unknowns resolve unk-12345678-... \
  --resolution not_affected \
  --justification "vulnerable_code_not_present" \
  --notes "Manual code review confirmed function not used"

# Resolve as fixed
stella unknowns resolve unk-12345678-... \
  --resolution fixed \
  --justification "version_upgraded" \
  --evidence "Upgraded lodash to 4.17.21, CVE patched"

# Resolve as mitigated
stella unknowns resolve unk-12345678-... \
  --resolution mitigated \
  --justification "inline_mitigations_exist" \
  --evidence "WAF rule WAF-001 blocks exploit pattern"

# Resolve as won't fix (risk accepted)
stella unknowns resolve unk-12345678-... \
  --resolution wont_fix \
  --justification "risk_accepted" \
  --evidence "Risk acceptance ticket RISK-123" \
  --expires 90d # Re-evaluate in 90 days
```

### 5.3 Bulk Resolution

```bash
# Resolve all items for a fixed package version
stella unknowns resolve-batch \
  --filter "purl=pkg:npm/lodash@4.17.20" \
  --resolution fixed \
  --justification "Upgraded to 4.17.21 fleet-wide" \
  --evidence "Fleet upgrade ticket FLEET-456"

# Resolve false positives from analysis
stella unknowns resolve-batch \
  --file false-positives.json \
  --resolution false_positive
```

### 5.4 Resolution Audit Trail

```bash
# View resolution history
stella unknowns history unk-12345678-...

# Output:
# 2025-12-15 10:00:00 - Created (score: 0.62)
# 2025-12-16 09:30:00 - Triaged by analyst@example.com
# 2025-12-17 14:00:00 - Escalated (KEV added)
# 2025-12-18 11:00:00 - Resolved by security@example.com
#   Resolution: not_affected
#   Justification: vulnerable_code_not_present
#   Notes: Manual code review confirmed function not used

# Export audit trail
stella unknowns audit-export --from 2025-01-01 --to 2025-12-31 --output audit.json
```

---

## 6. Troubleshooting
|
||||
|
||||
### 6.1 Score Seems Wrong
|
||||
|
||||
**Symptom**: Unknown scored too high or too low.
|
||||
|
||||
**Diagnosis**:
|
||||
|
||||
```bash
|
||||
# View score breakdown
|
||||
stella unknowns show unk-... --score-details
|
||||
|
||||
# View proof tree
|
||||
stella unknowns proof unk-... --verbose
|
||||
```
|
||||
|
||||
**Common causes**:
|
||||
|
||||
1. **Stale EPSS data**: EPSS feed not updated
|
||||
2. **Incorrect blast radius**: Dependency data outdated
|
||||
3. **Missing containment data**: Seccomp/filesystem status unknown
|
||||
|
||||
**Resolution**:
|
||||
|
||||
```bash
|
||||
# Trigger score recalculation
|
||||
stella unknowns recalculate unk-...
|
||||
|
||||
# Force refresh of all input signals
|
||||
stella unknowns refresh unk-... --force
|
||||
```
|
||||
|
||||
### 6.2 Duplicate Unknowns

**Symptom**: Same issue appears multiple times.

**Diagnosis**:

```bash
# Find potential duplicates
stella unknowns duplicates --scan

# Output shows items with same CVE+PURL but different artifacts
```
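The matching rule mentioned in the output comment (same CVE and PURL, different artifacts) can be sketched as follows. This is a minimal illustration, not the CLI's implementation: the field names `cve`, `purl`, and `artifact`, the placeholder CVE identifiers, and the sample records are all assumptions. Each group it returns is a candidate for a merge, with one item chosen as primary.

```python
from collections import defaultdict


def find_duplicates(unknowns):
    """Group unknowns by (CVE, PURL) and keep groups that span several artifacts."""
    groups = defaultdict(list)
    for item in unknowns:
        groups[(item["cve"], item["purl"])].append(item)
    return {
        key: items
        for key, items in groups.items()
        if len({i["artifact"] for i in items}) > 1
    }


# Hypothetical records; real items come from the unknowns queue.
unknowns = [
    {"id": "unk-111", "cve": "CVE-XXXX-0001", "purl": "pkg:npm/lodash@4.17.20", "artifact": "sha256:aaa"},
    {"id": "unk-222", "cve": "CVE-XXXX-0001", "purl": "pkg:npm/lodash@4.17.20", "artifact": "sha256:bbb"},
    {"id": "unk-333", "cve": "CVE-XXXX-0002", "purl": "pkg:npm/lodash@4.17.20", "artifact": "sha256:aaa"},
]

for (cve, purl), items in find_duplicates(unknowns).items():
    print(cve, purl, [i["id"] for i in items])
```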

**Resolution**:

```bash
# Merge duplicates
stella unknowns merge \
  --primary unk-111... \
  --secondary unk-222... \
  --reason "Same CVE across artifact versions"
```

### 6.3 Escalation Not Working

**Symptom**: Escalation doesn't trigger rescan.

**Diagnosis**:

```bash
# Check escalation status
stella unknowns escalation-status unk-...

# Check Scheduler connectivity
stella health check --service scheduler

# Check job queue
stella scheduler queue status rescan
```

**Resolution**:

```bash
# Retry escalation
stella unknowns escalate unk-... --force

# Manual rescan trigger
stella scan trigger --artifact sha256:abc123... --priority high
```

### 6.4 Resolution Rejected

**Symptom**: Resolution attempt fails validation.

**Diagnosis**:

```bash
# Check resolution requirements
stella unknowns resolution-requirements unk-...

# Output:
# Resolution requirements for unk-12345678-...
# - Justification: required
# - Evidence: required (reason: KEV item)
# - Approver: required (band: HOT)
```

**Resolution**:

```bash
# Provide required evidence
stella unknowns resolve unk-... \
  --resolution not_affected \
  --justification "vulnerable_code_not_present" \
  --evidence "Code review: CRV-123" \
  --approver security-lead@example.com
```
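A pre-flight check along the lines of the requirements output above can catch a rejection before the command is submitted. The sketch below is an assumption-laden illustration, not the server's validation logic: it infers from the sample output that a justification is always required, evidence is required for KEV items, and an approver is required for the HOT band. The `Resolution` type and `missing_fields` helper are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resolution:
    justification: Optional[str] = None
    evidence: Optional[str] = None
    approver: Optional[str] = None


def missing_fields(resolution: Resolution, is_kev: bool, band: str) -> List[str]:
    """List the fields a resolution still needs under the assumed rules."""
    missing = []
    if not resolution.justification:
        missing.append("justification")  # assumed to be required for every item
    if is_kev and not resolution.evidence:
        missing.append("evidence")  # required because the item is a KEV item
    if band == "HOT" and not resolution.approver:
        missing.append("approver")  # required because the item is in the HOT band
    return missing


draft = Resolution(justification="vulnerable_code_not_present")
print(missing_fields(draft, is_kev=True, band="HOT"))
```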

---

## 7. Monitoring & Alerting

### 7.1 Key Metrics

| Metric | Description | Alert Threshold |
|--------|-------------|-----------------|
| `unknowns_total` | Total unknowns in queue | > 500 |
| `unknowns_hot_count` | HOT band count | > 20 |
| `unknowns_sla_breached` | SLA breaches | > 0 |
| `unknowns_resolution_rate` | Daily resolutions | < 5 |
| `unknowns_escalation_failures` | Failed escalations | > 0 |
| `unknowns_avg_age_hours` | Average unknown age | > 168 (1 week) |

### 7.2 Grafana Dashboard

```
Dashboard: Unknowns Queue Health
Panels:
- Queue size by band (HOT/WARM/COLD)
- SLA compliance rate
- Unknowns by reason code
- Resolution velocity
- Escalation success rate
- Queue age distribution
- KEV item tracking
```

### 7.3 Alerting Rules

```yaml
groups:
  - name: unknowns-queue
    rules:
      - alert: UnknownsHotBandHigh
        expr: unknowns_hot_count > 20
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "HOT unknowns queue is high ({{ $value }} items)"

      - alert: UnknownsSLABreach
        expr: unknowns_sla_breached > 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "{{ $value }} unknowns have breached SLA"

      - alert: UnknownsQueueGrowing
        # Net growth of the queue-size gauge over the last hour
        expr: delta(unknowns_total[1h]) > 10
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "Unknowns queue is growing rapidly"

      - alert: UnknownsKEVPending
        expr: unknowns_kev_count > 0 and unknowns_kev_unresolved_age_hours > 24
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "KEV unknown pending for over 24 hours"
```
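For ad hoc checks outside Prometheus, the thresholds from the Key Metrics table in 7.1 can be evaluated against a snapshot of metric values. The following is a minimal sketch under that assumption; the `THRESHOLDS` mapping mirrors the table, while the `breached` helper and the sample values are illustrative only.

```python
import operator

# Metric name -> (comparison, threshold), mirroring the Key Metrics table.
THRESHOLDS = {
    "unknowns_total": (operator.gt, 500),
    "unknowns_hot_count": (operator.gt, 20),
    "unknowns_sla_breached": (operator.gt, 0),
    "unknowns_resolution_rate": (operator.lt, 5),
    "unknowns_escalation_failures": (operator.gt, 0),
    "unknowns_avg_age_hours": (operator.gt, 168),
}


def breached(samples):
    """Return the sorted names of metrics whose value crosses its alert threshold."""
    hits = []
    for name, value in samples.items():
        if name not in THRESHOLDS:
            continue  # ignore metrics without a documented threshold
        compare, limit = THRESHOLDS[name]
        if compare(value, limit):
            hits.append(name)
    return sorted(hits)


# Hypothetical snapshot of current metric values.
snapshot = {
    "unknowns_total": 120,
    "unknowns_hot_count": 25,
    "unknowns_sla_breached": 0,
    "unknowns_resolution_rate": 3,
}
print(breached(snapshot))
```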

### 7.4 Daily Report

```bash
# Generate daily report
stella unknowns report --format email --send-to security-team@example.com

# Report includes:
# - Queue summary (total, by band, by reason)
# - SLA status (in compliance, breaches)
# - Top 10 highest-scored items
# - Newly added items (last 24h)
# - Resolved items (last 24h)
# - KEV item status
# - Trends (7-day, 30-day)
```
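The queue summary and top-N parts of the report listed above can be approximated from an exported list of unknowns. This is a rough sketch rather than the report generator itself: the record fields (`band`, `reason`, `score`), the reason values, and the `summarize` helper are assumptions made for illustration.

```python
from collections import Counter


def summarize(items, top_n=10):
    """Build a queue summary: totals, counts per band and reason, top-scored items."""
    return {
        "total": len(items),
        "by_band": dict(Counter(i["band"] for i in items)),
        "by_reason": dict(Counter(i["reason"] for i in items)),
        "top": sorted(items, key=lambda i: i["score"], reverse=True)[:top_n],
    }


# Hypothetical queue export; reason values are placeholders.
queue = [
    {"id": "unk-a", "band": "HOT", "reason": "reason_a", "score": 0.91},
    {"id": "unk-b", "band": "WARM", "reason": "reason_b", "score": 0.62},
    {"id": "unk-c", "band": "HOT", "reason": "reason_a", "score": 0.85},
    {"id": "unk-d", "band": "COLD", "reason": "reason_b", "score": 0.20},
]

report = summarize(queue, top_n=2)
print(report["total"], report["by_band"], [i["id"] for i in report["top"]])
```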

---

## Related Documentation

- [Unknowns API Reference](../api/score-proofs-reachability-api-reference.md#5-unknowns-api)
- [Triage Technical Reference](../product-advisories/14-Dec-2025%20-%20Triage%20and%20Unknowns%20Technical%20Reference.md)
- [Score Proofs Runbook](./score-proofs-runbook.md)
- [Policy Engine](../modules/policy/architecture.md)

---

**Last Updated**: 2025-12-20
**Version**: 1.0.0
**Sprint**: 3500.0004.0004