Reachability Analysis Operations Runbook
Version: 1.0.0
Sprint: 3500.0004.0004
Last Updated: 2025-12-20
This runbook covers operational procedures for Reachability Analysis, including call graph management, analysis troubleshooting, and explain queries.
Table of Contents
- Overview
- Call Graph Operations
- Reachability Computation
- Explain Queries
- Troubleshooting
- Monitoring & Alerting
- Escalation Procedures
1. Overview
What is Reachability Analysis?
Reachability Analysis determines whether vulnerable code is actually reachable from application entrypoints. This reduces false positives by filtering out vulnerabilities in code that cannot be executed.
Reachability Statuses
| Status | Confidence | Description |
|---|---|---|
| UNREACHABLE | High | No path from entrypoints to vulnerable code |
| POSSIBLY_REACHABLE | Medium | Path exists but contains heuristic edges |
| REACHABLE_STATIC | High | Static analysis proves path exists |
| REACHABLE_PROVEN | Very High | Runtime evidence confirms execution |
| UNKNOWN | Low | Insufficient data to determine |
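The status table can be read as a decision rule over the best path found. The following is a minimal sketch of that rule, not the analyzer's actual logic: the edge kinds `"static"` and `"heuristic"`, the `runtime_confirmed` flag, and the `graph_complete` flag are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def classify(path_edge_kinds: Optional[Sequence[str]],
             runtime_confirmed: bool = False,
             graph_complete: bool = True) -> str:
    """Map the best path (as a list of edge kinds) to a reachability status.

    path_edge_kinds is None when no path was found. All parameter names
    are hypothetical and chosen only to mirror the table above.
    """
    if runtime_confirmed:
        return "REACHABLE_PROVEN"       # runtime evidence outranks static results
    if not graph_complete:
        return "UNKNOWN"                # insufficient data to decide either way
    if path_edge_kinds is None:
        return "UNREACHABLE"            # no path from any entrypoint
    if any(kind == "heuristic" for kind in path_edge_kinds):
        return "POSSIBLY_REACHABLE"     # path depends on at least one guessed edge
    return "REACHABLE_STATIC"           # every edge is statically proven
```

For example, `classify(["static", "heuristic"])` yields `POSSIBLY_REACHABLE`, because a single heuristic edge is enough to lower the verdict.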
Key Components
| Component | Purpose | Location |
|---|---|---|
| Call Graph Extractor | Language-specific CG extraction | Scanner Worker plugins |
| Call Graph Store | Persistent graph storage | scanner.cg_node, scanner.cg_edge |
| Reachability Analyzer | BFS pathfinding algorithm | Scanner Core library |
| Entrypoint Detector | Identifies application entrypoints | Language-specific plugins |
Prerequisites
- Access to Scanner WebService API
- scanner.reachability OAuth scope
- CLI access with stella configured
- Language-specific workers deployed (dotnet, java, etc.)
2. Call Graph Operations
2.1 Call Graph Upload
# Upload via API
curl -X POST "https://scanner.example.com/api/v1/scanner/scans/$SCAN_ID/callgraphs" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-H "Content-Digest: sha256=$(sha256sum callgraph.json | cut -d' ' -f1)" \
-d @callgraph.json
# Upload via CLI
stella scan graph upload --scan-id $SCAN_ID --file callgraph.json
# Upload streaming NDJSON (for large graphs)
stella scan graph upload --scan-id $SCAN_ID \
--file callgraph.ndjson \
--format ndjson \
--streaming
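When uploading from a script rather than curl, the digest header can be computed without loading the whole graph into memory. A minimal sketch, assuming the `sha256=<hex>` header format shown in the curl example is what the API expects; the function names are hypothetical.

```python
import hashlib


def sha256_hex(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 of a file, reading it in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops at the first empty read (end of file)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def content_digest_header(path: str) -> str:
    """Build the header value in the same form as the curl example."""
    return f"sha256={sha256_hex(path)}"
```

The hex string matches the first field printed by `sha256sum callgraph.json`.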
2.2 Call Graph Inspection
# Get call graph summary
stella scan graph summary --scan-id $SCAN_ID
# Output:
# Nodes: 12,345
# Edges: 56,789
# Entrypoints: 42
# Languages: [dotnet, java]
# Size: 15.2 MB
# List entrypoints
stella scan graph entrypoints --scan-id $SCAN_ID
# Export full graph (for debugging)
stella scan graph export --scan-id $SCAN_ID --output graph.json
# Visualize subgraph (requires GraphViz)
stella scan graph visualize --scan-id $SCAN_ID \
--node sha256:node123... \
--depth 3 \
--output subgraph.svg
2.3 Call Graph Validation
# Validate graph structure
stella scan graph validate --scan-id $SCAN_ID
# Checks performed:
# - All edge targets exist as nodes
# - Entrypoints reference valid nodes
# - No orphan nodes
# - No cycles in entrypoint definitions
# - Schema compliance
# Validate before upload
stella scan graph validate --file callgraph.json --strict
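The structural checks listed above can be approximated locally before an upload. This is a sketch under assumptions, not the validator that `stella` runs: the `nodes` and `edges` keys match the jq queries in section 5.1, but the `id`, `from`, `to`, `entrypoints`, and `node` field names are hypothetical.

```python
from typing import List


def validate_graph(graph: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means clean."""
    node_ids = {node["id"] for node in graph.get("nodes", [])}
    problems: List[str] = []
    connected = set()

    # Check both endpoints, a slightly stricter form of "edge targets exist"
    for edge in graph.get("edges", []):
        for end in ("from", "to"):
            if edge[end] not in node_ids:
                problems.append(f"edge endpoint {edge[end]} is not a node")
        connected.update((edge["from"], edge["to"]))

    entry_ids = set()
    for entry in graph.get("entrypoints", []):
        entry_ids.add(entry["node"])
        if entry["node"] not in node_ids:
            problems.append(f"entrypoint {entry['node']} is not a node")

    # A node with no edges that is not itself an entrypoint is an orphan
    for orphan in sorted(node_ids - connected - entry_ids):
        problems.append(f"orphan node {orphan}")
    return problems
```

Schema compliance and cycle checks on entrypoint definitions are left out because their rules are not described in this runbook.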
2.4 Call Graph Merging
When multiple language workers produce graphs:
# View merge status
stella scan graph merges --scan-id $SCAN_ID
# Output:
# Language | Nodes | Edges | Status
# dotnet | 8,234 | 34,567 | merged
# java | 4,111 | 22,222 | merged
# Total | 12,345 | 56,789 | complete
# Force re-merge (after fix)
stella scan graph merge --scan-id $SCAN_ID --force
3. Reachability Computation
3.1 Triggering Computation
# Trigger via API
curl -X POST "https://scanner.example.com/api/v1/scanner/scans/$SCAN_ID/reachability/compute" \
-H "Authorization: Bearer $TOKEN"
# Trigger via CLI
stella reachability compute --scan-id $SCAN_ID
# Trigger with options
stella reachability compute --scan-id $SCAN_ID \
--max-depth 20 \
--indirect-resolution conservative \
--timeout 300s
3.2 Computation Options
| Option | Default | Description |
|---|---|---|
| max-depth | 10 | Maximum path length to explore |
| indirect-resolution | conservative | How to handle indirect calls: conservative, aggressive, skip |
| timeout | 300s | Maximum computation time |
| parallel | true | Parallel BFS from multiple entrypoints |
| include-runtime | true | Merge runtime evidence if available |
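To make the effect of max-depth concrete, here is a small depth-bounded breadth-first search seeded from all entrypoints at once. It is an illustrative sketch, not the Scanner Core implementation: the adjacency-dict representation is an assumption, and depth is counted in nodes because the explain output in section 4.1 appears to count it that way (three nodes, depth=3).

```python
from collections import deque
from typing import Dict, Iterable, List, Optional


def shortest_path(adjacency: Dict[str, List[str]],
                  entrypoints: Iterable[str],
                  target: str,
                  max_depth: int = 10) -> Optional[List[str]]:
    """Return the shortest entrypoint-to-target path with at most
    max_depth nodes, or None if no such path exists."""
    parent: Dict[str, Optional[str]] = {}
    queue = deque()
    for entry in entrypoints:
        if entry not in parent:
            parent[entry] = None          # entrypoints have no predecessor
            queue.append((entry, 1))      # a lone entrypoint is depth 1

    while queue:
        node, depth = queue.popleft()
        if node == target:
            # Walk parent links back to the entrypoint, then reverse
            path: List[str] = []
            current: Optional[str] = node
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        if depth == max_depth:
            continue                      # do not expand past the depth limit
        for callee in adjacency.get(node, []):
            if callee not in parent:      # first visit is the shortest in BFS
                parent[callee] = node
                queue.append((callee, depth + 1))
    return None
```

Lowering `max_depth` shrinks the explored frontier, which is why reducing depth is listed as a timeout remedy in section 5.3; the trade-off is that longer real paths are missed.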
3.3 Job Monitoring
# Check job status
stella reachability job-status --job-id reachability-job-001
# Output:
# Status: running
# Progress: 67% (8,234 / 12,345 nodes visited)
# Started: 2025-12-20T10:00:00Z
# Estimated completion: 2025-12-20T10:02:30Z
# Stream job logs
stella reachability job-logs --job-id reachability-job-001 --follow
# Cancel running job
stella reachability job-cancel --job-id reachability-job-001
3.4 Computation Results
# Get summary
stella reachability summary --scan-id $SCAN_ID
# Output:
# Total vulnerabilities: 45
# Unreachable: 38 (84%)
# Possibly reachable: 4 (9%)
# Reachable (static): 2 (4%)
# Reachable (proven): 1 (2%)
# Unknown: 0 (0%)
# Get detailed findings
stella reachability findings --scan-id $SCAN_ID --format json
# Filter by status
stella reachability findings --scan-id $SCAN_ID --status REACHABLE_STATIC
# Export for CI gate
stella reachability findings --scan-id $SCAN_ID \
--status REACHABLE_STATIC,REACHABLE_PROVEN \
--format sarif \
--output findings.sarif
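A CI gate can also be built on the JSON output instead of SARIF. The sketch below assumes the JSON export is a list of objects with `status` and `cve` fields; that shape is a guess for illustration, so check it against real output from `stella reachability findings --format json` before relying on it.

```python
import json
import sys
from typing import Iterable, List

# Statuses that should fail the build, mirroring the --status filter above
BLOCKING = {"REACHABLE_STATIC", "REACHABLE_PROVEN"}


def blocking_findings(findings: Iterable[dict]) -> List[dict]:
    """Keep only findings whose status should block the pipeline."""
    return [f for f in findings if f.get("status") in BLOCKING]


def gate(path: str) -> int:
    """Return a process exit code: 1 if any blocking finding exists."""
    with open(path, encoding="utf-8") as handle:
        findings = json.load(handle)
    blocked = blocking_findings(findings)
    for finding in blocked:
        print(f"{finding.get('cve', '?')}: {finding['status']}", file=sys.stderr)
    return 1 if blocked else 0
```

A pipeline step would call `sys.exit(gate("findings.json"))` after exporting the findings.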
4. Explain Queries
4.1 Explain Single Finding
# Via API
curl "https://scanner.example.com/api/v1/scanner/scans/$SCAN_ID/reachability/explain?cve=CVE-2024-1234&purl=pkg:npm/lodash@4.17.20" \
-H "Authorization: Bearer $TOKEN"
# Via CLI
stella reachability explain --scan-id $SCAN_ID \
--cve CVE-2024-1234 \
--purl "pkg:npm/lodash@4.17.20"
# Output:
# Status: REACHABLE_STATIC
# Confidence: 0.70
#
# Shortest Path (depth=3):
# [0] MyApp.Controllers.OrdersController::Get(Guid)
# Entrypoint: HTTP GET /api/orders/{id}
# [1] MyApp.Services.OrderService::Process(Order)
# Edge: static (direct_call)
# [2] Lodash.merge(Object, Object) [VULNERABLE]
# Edge: static (direct_call)
#
# Why Reachable:
# - Static call path exists from HTTP entrypoint
# - All edges are statically proven
# - Vulnerable function is directly invoked
4.2 Explain with Alternatives
# Show all paths (not just shortest)
stella reachability explain --scan-id $SCAN_ID \
--cve CVE-2024-1234 \
--purl "pkg:npm/lodash@4.17.20" \
--all-paths
# Output includes:
# Alternative paths found: 3
# Path 1 (depth=3): ... [shown above]
# Path 2 (depth=5): Controllers.UserController -> ... -> Lodash.merge
# Path 3 (depth=7): Background.JobProcessor -> ... -> Lodash.merge
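Listing alternative paths is a different problem from finding the shortest one, since every simple path has to be enumerated. A rough sketch of the idea follows, using a depth-limited depth-first walk; the adjacency-dict input and the node-counted depth are assumptions, and how `--all-paths` is actually implemented is not documented here.

```python
from typing import Dict, Iterable, List


def all_paths(adjacency: Dict[str, List[str]],
              entrypoints: Iterable[str],
              target: str,
              max_depth: int = 10) -> List[List[str]]:
    """Enumerate simple paths (no repeated nodes) of at most max_depth
    nodes from any entrypoint to target, shortest first."""
    found: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        if node == target:
            found.append(list(path))      # copy, because path is mutated below
            return
        if len(path) == max_depth:
            return
        for callee in adjacency.get(node, []):
            if callee not in path:        # skip cycles
                path.append(callee)
                walk(callee, path)
                path.pop()

    for entry in entrypoints:
        walk(entry, [entry])
    return sorted(found, key=len)
```

The number of simple paths can grow exponentially with graph size, so an enumeration like this is only practical with a tight depth limit.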
4.3 Why Unreachable
# Explain why vulnerability is unreachable
stella reachability explain --scan-id $SCAN_ID \
--cve CVE-2024-5678 \
--purl "pkg:npm/unused-lib@1.0.0"
# Output:
# Status: UNREACHABLE
# Confidence: 0.95
#
# Why Unreachable:
# - No path found from any entrypoint
# - Vulnerable function: UnusedLib.dangerousMethod()
# - Function visibility: private
# - Callers found: 0
# - Dead code analysis: likely dead code
4.4 Batch Explain
# Export all reachability explanations
stella reachability explain-all --scan-id $SCAN_ID \
--output explanations.json
# Explain only reachable findings
stella reachability explain-all --scan-id $SCAN_ID \
--status REACHABLE_STATIC,REACHABLE_PROVEN \
--output reachable-explanations.json
5. Troubleshooting
5.1 Call Graph Too Large
Symptom: Upload fails with "413 Payload Too Large".
Diagnosis:
# Check graph size
du -h callgraph.json
wc -l callgraph.json
# Count nodes/edges
jq '.nodes | length' callgraph.json
jq '.edges | length' callgraph.json
Resolution:
# Option 1: Use streaming upload
stella scan graph upload --scan-id $SCAN_ID \
--file callgraph.json \
--streaming
# Option 2: Convert to NDJSON
stella scan graph convert --input callgraph.json \
--output callgraph.ndjson \
--format ndjson
# Option 3: Partition by artifact
stella scan graph partition --input callgraph.json \
--output-dir ./partitions/ \
--by artifact
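If the CLI is not available on the machine holding the graph, the JSON-to-NDJSON conversion is simple to approximate. This is a sketch with an invented record layout: the `kind` discriminator field is hypothetical, and the authoritative format is whatever `stella scan graph convert` emits.

```python
import json
from typing import IO


def write_ndjson(graph: dict, out: IO[str]) -> int:
    """Write one JSON object per line for every node and edge.

    Returns the number of records written. The "kind" field is an
    assumption used to tell node lines from edge lines.
    """
    count = 0
    for kind, key in (("node", "nodes"), ("edge", "edges")):
        for record in graph.get(key, []):
            out.write(json.dumps({"kind": kind, **record}, sort_keys=True))
            out.write("\n")
            count += 1
    return count
```

Because each line is an independent record, a receiver can process the upload incrementally instead of parsing one very large document.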
5.2 Missing Entrypoints
Symptom: "No entrypoints found" warning.
Diagnosis:
# Check entrypoint detection
stella scan graph entrypoints --scan-id $SCAN_ID --verbose
# Check for framework detection
stella scan graph detect-framework --scan-id $SCAN_ID
Common causes:
- Framework not detected: Add framework hints
- Custom entrypoints: Manually specify
- Wrong language worker: Check artifact analysis
Resolution:
# Specify framework explicitly
stella scan graph upload --scan-id $SCAN_ID \
--file callgraph.json \
--framework aspnetcore
# Add custom entrypoints
stella scan graph entrypoint add --scan-id $SCAN_ID \
--node sha256:node123... \
--kind http \
--route "/api/custom"
5.3 Reachability Computation Timeout
Symptom: Job fails with "computation timeout".
Diagnosis:
# Check computation stats
stella reachability job-stats --job-id reachability-job-001
# Output:
# Nodes visited: 500,000
# Edges traversed: 2,500,000
# Time elapsed: 300s
# Memory used: 4.2 GB
Resolution:
# Option 1: Increase timeout
stella reachability compute --scan-id $SCAN_ID --timeout 600s
# Option 2: Reduce depth
stella reachability compute --scan-id $SCAN_ID --max-depth 5
# Option 3: Skip indirect calls
stella reachability compute --scan-id $SCAN_ID --indirect-resolution skip
# Option 4: Partition analysis
stella reachability compute --scan-id $SCAN_ID --partition-by artifact
5.4 Inconsistent Results
Symptom: Different results between runs.
Diagnosis:
# Check determinism settings
stella scan manifest --scan-id $SCAN_ID | jq '.deterministic, .seed'
# Compare graph hashes
stella scan graph hash --scan-id $SCAN_ID
Resolution:
# Ensure deterministic mode
stella reachability compute --scan-id $SCAN_ID \
--deterministic \
--seed "AQIDBA==" # Fixed seed
# Use same graph version
stella reachability compute --scan-id $SCAN_ID \
--graph-digest sha256:cg123...
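Run-to-run differences often come from ordering rather than content, so comparing graphs by a canonical digest helps separate the two. The sketch below shows the general technique only; it is not the algorithm behind `stella scan graph hash`, and the `id`, `from`, and `to` field names are assumptions.

```python
import hashlib
import json


def graph_digest(graph: dict) -> str:
    """Hash a graph so that node and edge ordering does not affect the result."""
    canonical = {
        "nodes": sorted(node["id"] for node in graph["nodes"]),
        "edges": sorted([edge["from"], edge["to"]] for edge in graph["edges"]),
    }
    # Fixed key order and separators give byte-identical JSON for equal graphs
    payload = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

If two uploads of the same build produce different digests under a scheme like this, the extractor output itself differs and the problem lies upstream of the reachability computation.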
5.5 False Positives/Negatives
Symptom: Reachability verdict seems incorrect.
Diagnosis:
# Get detailed explanation
stella reachability explain --scan-id $SCAN_ID \
--cve CVE-2024-1234 \
--purl "pkg:npm/lodash@4.17.20" \
--verbose
# Check edge confidence
stella scan graph edge --scan-id $SCAN_ID \
--from sha256:nodeA... \
--to sha256:nodeB...
Common causes for false positives:
- Heuristic edges: Indirect call resolution too aggressive
- Reflection/dynamic calls: May create false paths
- Dead code not detected: Code exists but never executes
Common causes for false negatives:
- Missing edges: Call graph incomplete
- Indirect calls skipped: Resolution too conservative
- Cross-language calls: Language boundary not bridged
Resolution:
# Adjust indirect call resolution
stella reachability compute --scan-id $SCAN_ID \
--indirect-resolution conservative
# Add runtime evidence
stella scan evidence upload --scan-id $SCAN_ID \
--file runtime-trace.json
# Report false positive/negative for ML training
stella reachability feedback --scan-id $SCAN_ID \
--cve CVE-2024-1234 \
--verdict false-positive \
--reason "Dead code - feature flag disabled"
6. Monitoring & Alerting
6.1 Key Metrics
| Metric | Description | Alert Threshold |
|---|---|---|
| callgraph_upload_duration_seconds | Time to upload call graph | > 60s |
| callgraph_size_bytes | Size of uploaded graphs | > 200MB |
| reachability_computation_duration_seconds | Time to compute reachability | > 300s |
| reachability_nodes_visited | Nodes visited during BFS | > 1M |
| reachability_job_failures_total | Failed computation jobs | > 0/hour |
| entrypoint_detection_rate | % of scans with entrypoints | < 90% |
6.2 Grafana Dashboard
Dashboard: Reachability Operations
Panels:
- Call graph upload throughput
- Graph size distribution
- Computation duration (p50, p95, p99)
- Reachability verdict distribution
- Job queue depth
- Entrypoint detection rate
6.3 Alerting Rules
groups:
  - name: reachability
    rules:
      - alert: ReachabilityComputationSlow
        # Assumes the duration metric is a Prometheus histogram exposing _bucket series
        expr: histogram_quantile(0.95, sum by (le) (rate(reachability_computation_duration_seconds_bucket[5m]))) > 300
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Reachability computation is slow"
      - alert: ReachabilityJobFailures
        expr: increase(reachability_job_failures_total[1h]) > 5
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Multiple reachability job failures"
      - alert: LowEntrypointDetectionRate
        expr: entrypoint_detection_rate < 0.8
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Entrypoint detection rate is low"
7. Escalation Procedures
7.1 Escalation Matrix
| Severity | Condition | Response Time | Escalation Path |
|---|---|---|---|
| P1 | Reachability failing for all scans | 15 min | On-call → Team Lead |
| P2 | Computation failures > 20% | 1 hour | On-call → Team Lead |
| P3 | Computation latency > 600s p95 | 4 hours | On-call |
| P4 | Entrypoint detection < 70% | 24 hours | Ticket |
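The matrix can be encoded as a small triage helper for alert routing. This is only a sketch of the thresholds in the table: the function and parameter names are hypothetical, rates are assumed to be fractions between 0 and 1, and the highest matching severity wins.

```python
from typing import Optional


def severity(all_scans_failing: bool,
             failure_rate: float,
             p95_latency_s: float,
             entrypoint_detection: float) -> Optional[str]:
    """Return the highest applicable severity, or None if nothing applies."""
    if all_scans_failing:
        return "P1"                      # 15 min response
    if failure_rate > 0.20:
        return "P2"                      # 1 hour response
    if p95_latency_s > 600:
        return "P3"                      # 4 hour response
    if entrypoint_detection < 0.70:
        return "P4"                      # ticket, 24 hours
    return None
```

For example, `severity(False, 0.25, 700, 0.5)` returns `"P2"`, since a failure rate above 20% outranks the latency and detection conditions.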
7.2 P1 Response Procedure
- Acknowledge alert
- Triage:
  # Check worker health
  stella scanner workers status
  # Check graph store connectivity
  stella health check --service graph-store
  # Check recent failures
  stella reachability jobs --status failed --last 10
- Mitigate:
  # Scale up workers if queue backlog
  kubectl scale deployment scanner-worker --replicas=10
  # Clear stuck jobs
  stella reachability jobs cancel --status stuck
- Communicate: Update status page
- Resolve: Fix root cause
- Postmortem: Document within 48 hours
Related Documentation