# Reachability Analysis Concept Guide

**Sprint:** SPRINT_3500_0004_0004
**Audience:** Developers, Security Engineers, DevOps

## Introduction

Reachability Analysis determines whether vulnerable code can actually be reached during program execution. This guide explains how StellaOps uses call graphs, BFS traversal, and confidence scoring to separate actionable vulnerabilities from noise.

---

## The Problem: Alert Fatigue

Traditional vulnerability scanners report every known CVE in your dependencies:

```
❌ CVE-2024-1234 in lodash@4.17.20 (CRITICAL)
❌ CVE-2024-5678 in express@4.18.0 (HIGH)
❌ CVE-2024-9012 in moment@2.29.0 (MEDIUM)
... 247 more findings
```

**The reality:**
- 80-90% of reported vulnerabilities are **unreachable**
- Teams waste time investigating false positives
- Real risks get lost in the noise
- Security fatigue leads to ignored alerts

---

## The Solution: Reachability Analysis

StellaOps analyzes your application's **call graph** to determine if vulnerable functions are actually invoked:

```
✅ CVE-2024-1234 in lodash@4.17.20 - UNREACHABLE (safe to deprioritize)
⚠️  CVE-2024-5678 in express@4.18.0 - POSSIBLY_REACHABLE (review)
🔴 CVE-2024-9012 in moment@2.29.0 - REACHABLE_STATIC (fix required)
```

Result: Focus on the 10-20% that actually matter.

---

## Core Concepts

### 1. Call Graph

A **Call Graph** represents function calls in your application:

```
┌─────────────────────────────────────────────────────────────┐
│                       Your Application                       │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌────────────────┐                                         │
│  │ HTTP Endpoint  │  ← Entrypoint                          │
│  │ /api/orders    │                                         │
│  └───────┬────────┘                                         │
│          │ calls                                             │
│          ▼                                                   │
│  ┌────────────────┐      ┌────────────────┐                │
│  │ OrderService   │─────▶│ PaymentService │                │
│  │ .processOrder()│      │ .charge()      │                │
│  └───────┬────────┘      └────────────────┘                │
│          │ calls                                             │
│          ▼                                                   │
│  ┌────────────────┐                                         │
│  │ lodash.merge() │  ← Vulnerable function                 │
│  │ (CVE-2024-1234)│                                         │
│  └────────────────┘                                         │
│                                                              │
└─────────────────────────────────────────────────────────────┘
```

**Components:**
- **Nodes**: Functions, methods, classes
- **Edges**: Call relationships
- **Entrypoints**: Where execution begins (HTTP routes, CLI commands, etc.)
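
As a minimal sketch, the diagram above can be modelled as an adjacency list. The node names mirror the diagram; the dictionary layout is an illustrative assumption, not StellaOps' internal representation.

```python
# Illustrative only: the call graph from the diagram as an adjacency list.
# Keys are caller nodes; values are the nodes they call (the outgoing edges).
call_graph = {
    "HTTP /api/orders": ["OrderService.processOrder"],
    "OrderService.processOrder": ["PaymentService.charge", "lodash.merge"],
    "PaymentService.charge": [],
    "lodash.merge": [],
}

# Entrypoints are the nodes where execution begins.
entrypoints = ["HTTP /api/orders"]

nodes = set(call_graph)
edges = [
    (caller, callee)
    for caller, callees in call_graph.items()
    for callee in callees
]

print(f"{len(nodes)} nodes, {len(edges)} edges, {len(entrypoints)} entrypoint(s)")
```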

### 2. Entrypoints

**Entrypoints** are where external input enters your application:

| Kind | Examples |
|------|----------|
| HTTP | `GET /api/orders`, `POST /users` |
| gRPC | `OrderService.GetOrder` |
| Message Queue | `orders.created` consumer |
| CLI | `./app --process-file` |
| Scheduled | Cron job, background worker |

### 3. Reachability Status

Each vulnerability gets one of these statuses:

| Status | Meaning | Action |
|--------|---------|--------|
| `UNREACHABLE` | No path from any entrypoint | Safe to deprioritize |
| `POSSIBLY_REACHABLE` | Path exists via indirect/heuristic edges | Review |
| `REACHABLE_STATIC` | Direct static path exists | Prioritize fix |
| `REACHABLE_PROVEN` | Runtime trace confirms execution | Fix immediately |
| `UNKNOWN` | Insufficient call graph data | Investigate |

### 4. Edge Types

Call graph edges have different confidence levels:

| Edge Type | Confidence | Description |
|-----------|------------|-------------|
| `direct_call` | High | Static function call |
| `virtual_dispatch` | Medium | Interface/virtual method |
| `reflection` | Low | Reflection-based call |
| `dynamic` | Low | Dynamic dispatch |
| `heuristic` | Very Low | Inferred relationship |
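
The sketch below shows one way the statuses and edge types could fit together. The guide does not specify which edge types count as "static", so treating only `direct_call` as static is an assumption, as are the function name `classify` and its parameters.

```python
from enum import Enum


class Status(Enum):
    UNREACHABLE = "UNREACHABLE"
    POSSIBLY_REACHABLE = "POSSIBLY_REACHABLE"
    REACHABLE_STATIC = "REACHABLE_STATIC"
    REACHABLE_PROVEN = "REACHABLE_PROVEN"
    UNKNOWN = "UNKNOWN"


# Assumption: only direct calls are treated as statically proven edges.
STATIC_EDGE_TYPES = {"direct_call"}


def classify(path_edge_types, runtime_confirmed=False, graph_complete=True):
    """Map a path (list of edge types, or None if no path) to a status."""
    if runtime_confirmed:
        return Status.REACHABLE_PROVEN
    if path_edge_types is None:
        # No path found: only trust that result if the graph is complete.
        return Status.UNREACHABLE if graph_complete else Status.UNKNOWN
    if all(t in STATIC_EDGE_TYPES for t in path_edge_types):
        return Status.REACHABLE_STATIC
    return Status.POSSIBLY_REACHABLE


print(classify(["direct_call", "direct_call"]).value)
print(classify(["direct_call", "reflection"]).value)
print(classify(None, graph_complete=False).value)
```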

### 5. Confidence Score

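**Confidence** quantifies how certain the analysis is that a vulnerable function is reachable, on a scale from 0.0 to 1.0. Each factor contributes its weight when it holds and nothing otherwise: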

```
Confidence = weighted_sum([
  staticPathExists     × 0.50,
  allEdgesStatic       × 0.20,
  noReflection         × 0.10,
  runtimeConfirmed     × 0.15,
  symbolResolved       × 0.05
])
```

Example:
- Static path exists: +0.50
- All edges are direct calls: +0.20
- No reflection: +0.10
- Not runtime confirmed: +0.00
- All symbols resolved: +0.05
- **Total: 0.85**
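
The worked example can be reproduced with a few lines of Python. The weights come from the formula above; the function name and the use of booleans for each factor are assumptions made for illustration.

```python
# Weights as listed in the formula above; they sum to 1.0.
WEIGHTS = {
    "staticPathExists": 0.50,
    "allEdgesStatic": 0.20,
    "noReflection": 0.10,
    "runtimeConfirmed": 0.15,
    "symbolResolved": 0.05,
}


def confidence(factors):
    """Sum the weights of every factor that holds; missing factors count as False."""
    score = sum(w for name, w in WEIGHTS.items() if factors.get(name, False))
    return round(score, 2)  # rounding hides floating-point noise


example = {
    "staticPathExists": True,
    "allEdgesStatic": True,
    "noReflection": True,
    "runtimeConfirmed": False,
    "symbolResolved": True,
}

print(confidence(example))  # 0.85
```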

---

## How It Works

### Step 1: Call Graph Generation

Your build tooling generates the call graph, which you then upload to StellaOps.

**Build-time extraction** (most accurate):
```bash
# .NET (Roslyn)
dotnet build --generate-call-graph

# Java (gradle plugin)
./gradlew generateCallGraph

# Node.js (static analysis)
npx @stellaops/callgraph-generator .
```

**Upload to StellaOps**:
```bash
stella scan graph upload --scan-id $SCAN_ID --file callgraph.json
```

### Step 2: Entrypoint Detection

StellaOps identifies entrypoints automatically:

```json
{
  "entrypoints": [
    {
      "kind": "http",
      "route": "GET /api/orders/{id}",
      "method": "MyApp.Controllers.OrdersController::Get",
      "framework": "aspnetcore"
    },
    {
      "kind": "grpc",
      "service": "OrderService",
      "method": "MyApp.Services.OrderGrpcService::GetOrder",
      "framework": "grpc-dotnet"
    }
  ]
}
```
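
For illustration, the detected entrypoints can be read with the standard `json` module to obtain the start nodes for traversal. The variable names are hypothetical; only the JSON fields come from the example above.

```python
import json

# The entrypoint document from above, embedded as a string for the example.
ENTRYPOINTS_JSON = """
{
  "entrypoints": [
    {"kind": "http", "route": "GET /api/orders/{id}",
     "method": "MyApp.Controllers.OrdersController::Get",
     "framework": "aspnetcore"},
    {"kind": "grpc", "service": "OrderService",
     "method": "MyApp.Services.OrderGrpcService::GetOrder",
     "framework": "grpc-dotnet"}
  ]
}
"""

entrypoints = json.loads(ENTRYPOINTS_JSON)["entrypoints"]

# The "method" symbols are the nodes a traversal would start from.
start_nodes = [e["method"] for e in entrypoints]

# Group the same symbols by entrypoint kind (http, grpc, ...).
methods_by_kind = {}
for e in entrypoints:
    methods_by_kind.setdefault(e["kind"], []).append(e["method"])

print(start_nodes)
print(sorted(methods_by_kind))
```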

### Step 3: BFS Traversal

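For each vulnerable function, a breadth-first search (BFS) starts at the entrypoints and follows call edges until it reaches the vulnerable node or exhausts the graph. Because BFS explores the graph level by level, the first path it finds is also a shortest one: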

```
Queue: [HTTP /api/orders → OrdersController::Get]

Step 1: Visit OrdersController::Get
        → Neighbors: [OrderService::Process, Logger::Log]
        → Add to queue: OrderService::Process, Logger::Log

Step 2: Visit OrderService::Process
        → Neighbors: [Lodash::merge (VULNERABLE!)]
        → PATH FOUND! Depth = 2

Result: REACHABLE_STATIC
Path: /api/orders → OrdersController::Get → OrderService::Process → Lodash::merge
```
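
The trace above corresponds to a standard BFS with parent tracking. The following is a minimal sketch on the same three-node example, not the StellaOps engine itself; the function name `shortest_path` is an assumption. This version checks for the target when a node is dequeued rather than when it is first discovered, which yields the same path.

```python
from collections import deque


def shortest_path(graph, entrypoints, target):
    """Return a shortest call path from any entrypoint to target, or None."""
    parents = {ep: None for ep in entrypoints}  # also serves as the visited set
    queue = deque(entrypoints)
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the entrypoint.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for callee in graph.get(node, []):
            if callee not in parents:
                parents[callee] = node
                queue.append(callee)
    return None


graph = {
    "OrdersController::Get": ["OrderService::Process", "Logger::Log"],
    "OrderService::Process": ["Lodash::merge"],
    "Logger::Log": [],
}

path = shortest_path(graph, ["OrdersController::Get"], "Lodash::merge")
print(" → ".join(path))
print("depth =", len(path) - 1)  # depth = 2
```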

### Step 4: Confidence Calculation

The confidence score is then computed from the quality of the path that was found:

```yaml
path:
  - node: OrdersController::Get
    edge_type: entrypoint
  - node: OrderService::Process
    edge_type: direct_call    # +0.50 static
  - node: Lodash::merge
    edge_type: direct_call    # +0.20 all static

factors:
  staticPathExists: 0.50
  allEdgesStatic: 0.20
  noReflection: 0.10
  runtimeConfirmed: 0.00
  symbolResolved: 0.05

confidence: 0.85
```

---

## Understanding Results

### Explain Query

Get a detailed explanation for any finding:

```bash
stella reachability explain \
  --scan-id $SCAN_ID \
  --cve CVE-2024-1234 \
  --purl "pkg:npm/lodash@4.17.20"
```

**Output:**
```
Status: REACHABLE_STATIC
Confidence: 0.85

Shortest Path (depth=2):
[0] MyApp.Controllers.OrdersController::Get(Guid)
    Entrypoint: HTTP GET /api/orders/{id}
[1] MyApp.Services.OrderService::Process(Order)
    Edge: static (direct_call)
[2] Lodash.merge(Object, Object) [VULNERABLE]
    Edge: static (direct_call)

Why Reachable:
- Static call path exists from HTTP entrypoint
- All edges are statically proven (no heuristics)
- Vulnerable function is directly invoked

Confidence Factors:
  staticPathExists: +0.50
  allEdgesStatic: +0.20
  noReflection: +0.10
  runtimeConfirmed: +0.00
  symbolResolved: +0.05
```

### Interpreting Status

| Status | What it means | What to do |
|--------|---------------|------------|
| `UNREACHABLE` | No code path calls the vulnerable function | Safe to deprioritize; track for visibility |
| `POSSIBLY_REACHABLE` | Path exists but involves heuristics | Review the path; add call graph data if missing |
| `REACHABLE_STATIC` | Static analysis proves reachability | Prioritize remediation |
| `REACHABLE_PROVEN` | Runtime data confirms execution | Fix immediately; exploitability confirmed |
| `UNKNOWN` | Call graph incomplete | Improve call graph coverage |

---

## Best Practices

### 1. Generate Complete Call Graphs

Incomplete call graphs lead to `UNKNOWN` status:

```bash
# Check call graph completeness
stella scan graph summary --scan-id $SCAN_ID

# Output:
# Nodes: 12,345 (expected: ~15,000 for project size)
# Coverage: 82%
# Orphan nodes: 234
```

**Tips for better coverage:**
- Include all modules in build
- Enable whole-program analysis
- Include test code (may reveal paths)
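
To make the summary figures concrete, here is a rough sketch of how such numbers could be derived. It assumes an "orphan" is a node with no incoming or outgoing edges and that coverage is nodes found divided by expected nodes, which is consistent with the sample output (12,345 / 15,000 ≈ 82%). Neither definition is confirmed by this guide.

```python
def graph_summary(nodes, edges, expected_nodes):
    """Summarize a call graph given node ids and (caller, callee) pairs."""
    # Assumption: a node is an orphan if no edge touches it.
    connected = {n for edge in edges for n in edge}
    orphans = [n for n in nodes if n not in connected]
    # Assumption: coverage compares the node count with an expected size.
    coverage = len(nodes) / expected_nodes
    return {"nodes": len(nodes), "coverage": coverage, "orphans": orphans}


summary = graph_summary(
    nodes=["A", "B", "C", "D"],
    edges=[("A", "B"), ("B", "C")],
    expected_nodes=5,
)
print(f"Nodes: {summary['nodes']}")
print(f"Coverage: {summary['coverage']:.0%}")
print(f"Orphan nodes: {len(summary['orphans'])}")
```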

### 2. Review `POSSIBLY_REACHABLE` Findings

These often indicate:
- Reflection use
- Dynamic dispatch
- Framework magic (DI, AOP)

```bash
# Get details
stella reachability explain \
  --scan-id $SCAN_ID \
  --cve CVE-2024-5678 \
  --all-paths
```

### 3. Add Runtime Evidence

Runtime traces increase confidence:

```bash
# Enable runtime instrumentation
stella scan run \
  --image $IMAGE \
  --include-runtime \
  --runtime-profile production-traces.json
```

### 4. Handle `UNKNOWN` Appropriately

Don't ignore unknowns. They represent gaps in the analysis:

```bash
# List unknowns
stella reachability findings --scan-id $SCAN_ID --status UNKNOWN

# Common causes:
# - External library without call graph
# - Native code (FFI)
# - Dynamic languages without type info
```

### 5. Integrate with CI/CD

```yaml
# Example GitHub Actions
- name: Run reachability scan
  run: |
    stella scan run --image $IMAGE --reachability enabled

- name: Check for reachable vulnerabilities
  run: |
    # Fail if any CVE is reachable (no severity filter is applied here).
    # Assumes SCAN_ID is set in this step's environment.
    REACHABLE=$(stella reachability findings \
      --scan-id "$SCAN_ID" \
      --status REACHABLE_STATIC,REACHABLE_PROVEN \
      --output-format json | jq 'length')

    if [ "$REACHABLE" -gt 0 ]; then
      echo "Found $REACHABLE reachable vulnerabilities!"
      exit 1
    fi
```
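
If the gate should also take severity into account, the JSON output could be post-processed in a small script. This is a sketch only: the field names `cve`, `severity`, and `status` are assumptions about the findings format, and the sample records reuse the three CVEs from the introduction.

```python
import json

BLOCKING_STATUSES = {"REACHABLE_STATIC", "REACHABLE_PROVEN"}
SEVERITY_RANK = {"LOW": 0, "MEDIUM": 1, "HIGH": 2, "CRITICAL": 3}


def blocking_findings(findings, min_severity="HIGH"):
    """Keep findings that are reachable and at or above min_severity."""
    threshold = SEVERITY_RANK[min_severity]
    return [
        f for f in findings
        if f["status"] in BLOCKING_STATUSES
        and SEVERITY_RANK[f["severity"]] >= threshold
    ]


# Hypothetical findings payload; the real output format may differ.
sample = json.loads("""
[
  {"cve": "CVE-2024-1234", "severity": "CRITICAL", "status": "UNREACHABLE"},
  {"cve": "CVE-2024-5678", "severity": "HIGH", "status": "POSSIBLY_REACHABLE"},
  {"cve": "CVE-2024-9012", "severity": "MEDIUM", "status": "REACHABLE_STATIC"}
]
""")

blockers = blocking_findings(sample)
exit_code = 1 if blockers else 0  # a CI step would exit with this code
print(f"{len(blockers)} blocking finding(s), exit code {exit_code}")
```

With the default `HIGH` threshold none of the sample findings block the build; lowering the threshold to `MEDIUM` would flag CVE-2024-9012.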

---

## Call Graph Formats

### Supported Formats

| Format | Extension | Use Case |
|--------|-----------|----------|
| JSON | `.json` | Standard interchange |
| NDJSON | `.ndjson` | Large graphs (streaming) |
| DOT | `.dot` | Visualization |
| Custom | `.cg` | StellaOps native |

### JSON Schema

```json
{
  "version": "1.0",
  "language": "dotnet",
  "nodes": [
    {
      "id": "sha256:abc123...",
      "symbol": "MyApp.Services.OrderService::Process",
      "kind": "method",
      "location": {
        "file": "Services/OrderService.cs",
        "line": 42
      }
    }
  ],
  "edges": [
    {
      "source": "sha256:abc123...",
      "target": "sha256:def456...",
      "type": "direct_call",
      "location": {
        "file": "Services/OrderService.cs",
        "line": 55
      }
    }
  ],
  "entrypoints": [
    {
      "nodeId": "sha256:ghi789...",
      "kind": "http",
      "route": "GET /api/orders/{id}"
    }
  ]
}
```
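
A document in this shape can be turned into an adjacency list for traversal. The sketch below assumes the field names shown above and uses short hypothetical ids (`n1`, `n2`, ...) in place of the `sha256:` ids; it also reports edges that point at unknown nodes, a common symptom of an incomplete graph.

```python
def build_adjacency(doc):
    """Return (adjacency, dangling_edges) for a call graph document."""
    node_ids = {n["id"] for n in doc["nodes"]}
    adjacency = {node_id: [] for node_id in node_ids}
    dangling = []
    for edge in doc["edges"]:
        if edge["source"] in node_ids and edge["target"] in node_ids:
            # Keep the edge type so confidence can be assessed later.
            adjacency[edge["source"]].append((edge["target"], edge["type"]))
        else:
            dangling.append(edge)
    return adjacency, dangling


# Hypothetical document with shortened ids.
doc = {
    "version": "1.0",
    "language": "dotnet",
    "nodes": [
        {"id": "n1", "symbol": "MyApp.Controllers.OrdersController::Get", "kind": "method"},
        {"id": "n2", "symbol": "MyApp.Services.OrderService::Process", "kind": "method"},
    ],
    "edges": [
        {"source": "n1", "target": "n2", "type": "direct_call"},
        {"source": "n2", "target": "n9", "type": "reflection"},
    ],
    "entrypoints": [{"nodeId": "n1", "kind": "http", "route": "GET /api/orders/{id}"}],
}

adjacency, dangling = build_adjacency(doc)
print(adjacency["n1"])
print(len(dangling), "dangling edge(s)")
```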

---

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────────┐
│                    Reachability Analysis System                  │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐           │
│  │ Call Graph  │──▶│ Entrypoint  │──▶│ Reachability│           │
│  │   Parser    │   │  Detector   │   │   Engine    │           │
│  └─────────────┘   └─────────────┘   └─────────────┘           │
│         │                │                  │                   │
│         ▼                ▼                  ▼                   │
│  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐           │
│  │   Graph     │   │  Framework  │   │   Path      │           │
│  │   Store     │   │  Adapters   │   │   Cache     │           │
│  └─────────────┘   └─────────────┘   └─────────────┘           │
│                                                                  │
│  Symbol Resolution                                               │
│  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐           │
│  │   CVE →     │──▶│   Symbol    │──▶│   Node      │           │
│  │   Function  │   │   Matcher   │   │   Lookup    │           │
│  └─────────────┘   └─────────────┘   └─────────────┘           │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
```

---

## Troubleshooting

### "Too many UNKNOWN findings"

**Cause**: Incomplete call graph

**Solution**:
```bash
# Check coverage
stella scan graph summary --scan-id $SCAN_ID

# Regenerate with more options
# For .NET:
dotnet build --generate-call-graph --whole-program
```

### "False UNREACHABLE"

**Cause**: Missing edge (reflection, dynamic dispatch)

**Solution**:
```bash
# Check for known patterns
stella scan graph validate --scan-id $SCAN_ID

# Add hints for reflection patterns
stella scan run --reflection-hints reflection-config.json
```
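
The effect of a missing edge can be shown on a toy graph. Here hints are modelled as extra `(caller, callee)` pairs merged into the graph before traversal; the actual format of `reflection-config.json` is not described in this guide, and all node names are hypothetical.

```python
def is_reachable(graph, start, target):
    """Depth-first check whether target can be reached from start."""
    stack, seen = [start], {start}
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for callee in graph.get(node, []):
            if callee not in seen:
                seen.add(callee)
                stack.append(callee)
    return False


def with_hints(graph, hints):
    """Return a copy of graph with the hinted edges added."""
    merged = {caller: list(callees) for caller, callees in graph.items()}
    for caller, callee in hints:
        merged.setdefault(caller, []).append(callee)
    return merged


# Static analysis cannot see the reflective call made by PluginLoader::Invoke.
graph = {
    "Api::Handle": ["PluginLoader::Invoke"],
    "PluginLoader::Invoke": [],
    "Legacy::Parse": ["Vulnerable::Sink"],
}
hints = [("PluginLoader::Invoke", "Legacy::Parse")]

print(is_reachable(graph, "Api::Handle", "Vulnerable::Sink"))  # False
print(is_reachable(with_hints(graph, hints), "Api::Handle", "Vulnerable::Sink"))  # True
```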

### "Computation timeout"

**Cause**: Large graph, deep paths

**Solution**:
```bash
# Increase timeout
stella reachability compute --scan-id $SCAN_ID --timeout 600s

# Or limit depth
stella reachability compute --scan-id $SCAN_ID --max-depth 15
```
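
Limiting depth trades completeness for speed: a path longer than the limit is not found. The sketch below illustrates this with a depth-bounded BFS over a 20-call chain. How StellaOps reports a search cut off by `--max-depth` is not covered here, and the function name is an assumption.

```python
from collections import deque


def reachable_within(graph, entrypoints, target, max_depth):
    """Return the depth at which target is found, or None within max_depth."""
    seen = set(entrypoints)
    queue = deque((ep, 0) for ep in entrypoints)
    while queue:
        node, depth = queue.popleft()
        if node == target:
            return depth
        if depth == max_depth:
            continue  # do not expand nodes beyond the depth limit
        for callee in graph.get(node, []):
            if callee not in seen:
                seen.add(callee)
                queue.append((callee, depth + 1))
    return None


# A chain f0 -> f1 -> ... -> f20, so f20 sits at depth 20.
chain = {f"f{i}": [f"f{i + 1}"] for i in range(20)}

print(reachable_within(chain, ["f0"], "f20", max_depth=15))  # None
print(reachable_within(chain, ["f0"], "f20", max_depth=25))  # 20
```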

---

## Related Documentation

- [Reachability CLI Reference](../cli/reachability-cli-reference.md)
- [Reachability API Reference](../api/score-proofs-reachability-api-reference.md)
- [Reachability Runbook](../operations/reachability-runbook.md)
- [Score Proofs Concept Guide](./score-proofs-concept-guide.md)

---

**Last Updated**: 2025-12-20
**Version**: 1.0.0