# Reachability Drift Detection - Architecture

**Module:** Scanner
**Version:** 1.0
**Status:** Implemented (Sprint 3600.2-3600.3)
**Last Updated:** 2025-12-22

---

## 1. Overview

Reachability Drift Detection tracks function-level reachability changes between scans to identify when code modifications create new paths to vulnerable sinks or mitigate existing risks. This enables security teams to:

- **Detect regressions** when previously unreachable vulnerabilities become exploitable
- **Validate fixes** by confirming vulnerable code paths are removed
- **Prioritize triage** based on actual exploitability rather than theoretical risk
- **Automate VEX** by generating evidence-backed justifications

---

## 2. Key Concepts

### 2.1 Call Graph

A directed graph representing function/method call relationships in source code:

- **Nodes**: Functions, methods, lambdas with metadata (file, line, visibility)
- **Edges**: Call relationships with call kind (direct, virtual, delegate, reflection, dynamic)
- **Entrypoints**: Public-facing functions (HTTP handlers, CLI commands, message consumers)
- **Sinks**: Security-sensitive APIs (command execution, SQL, file I/O, deserialization)

### 2.2 Reachability Analysis

A multi-source breadth-first search (BFS) from all entrypoints determines which sinks are reachable, and therefore potentially exploitable:

```
Entrypoints (HTTP handlers, CLI)
        │
        ▼ BFS traversal
[Application Code]
        │
        ▼
Sinks (exec, query, writeFile)
        │
        ▼
Reachable = TRUE if path exists
```

### 2.3 Drift Detection

Drift detection compares sink reachability between two scans (base vs. head):

| Transition | Direction | Risk Impact |
|------------|-----------|-------------|
| Unreachable → Reachable | `became_reachable` | **Increased** - New exploit path |
| Reachable → Unreachable | `became_unreachable` | **Decreased** - Mitigation applied |

### 2.4 Cause Attribution

Cause attribution explains *why* drift occurred by correlating it with the code changes between the two scans:

| Cause Kind | Description | Example |
|------------|-------------|---------|
| `guard_removed` | Conditional check removed | `if (!authorized)` deleted |
| `guard_added` | New conditional blocks path | Added null check |
| `new_public_route` | New entrypoint created | Added `/api/admin` endpoint |
| `visibility_escalated` | Internal → Public | Method made public |
| `dependency_upgraded` | Library update changed behavior | lodash 4.x → 5.x |
| `symbol_removed` | Function deleted | Removed vulnerable helper |
| `unknown` | Cannot determine | Multiple simultaneous changes |

---

## 3. Data Flow

```mermaid
flowchart TD
    subgraph Scan["Scan Execution"]
        A[Source Code] --> B[Call Graph Extractor]
        B --> C[CallGraphSnapshot]
    end

    subgraph Analysis["Drift Analysis"]
        C --> D[Reachability Analyzer]
        D --> E[ReachabilityResult]
        F[Base Scan Graph] --> G[Drift Detector]
        E --> G
        H[Code Changes] --> G
        G --> I[ReachabilityDriftResult]
    end

    subgraph Output["Output"]
        I --> J[Path Compressor]
        J --> K[Compressed Paths]
        I --> L[Cause Explainer]
        L --> M[Drift Causes]
        K --> N[Storage/API]
        M --> N
    end

    subgraph Integration["Integration"]
        N --> O[Policy Gates]
        N --> P[VEX Emission]
        N --> Q[Web UI]
    end
```

---

## 4. Component Architecture

### 4.1 Call Graph Extractors

Per-language static analysis (AST, bytecode, or SSA, depending on the language) producing a `CallGraphSnapshot`:

| Language | Extractor | Technology | Status |
|----------|-----------|------------|--------|
| .NET | `DotNetCallGraphExtractor` | Roslyn semantic model | **Done** |
| Java | `JavaCallGraphExtractor` | ASM bytecode analysis | **Done** |
| Go | `GoCallGraphExtractor` | golang.org/x/tools SSA | **Done** |
| Python | `PythonCallGraphExtractor` | Python AST | **Done** |
| Node.js | `NodeCallGraphExtractor` | Babel (planned) | Skeleton |
| PHP | `PhpCallGraphExtractor` | php-parser | **Done** |
| Ruby | `RubyCallGraphExtractor` | parser gem | **Done** |

**Location:** `src/Scanner/__Libraries/StellaOps.Scanner.CallGraph/Extraction/`

### 4.2 Reachability Analyzer

Multi-source BFS from entrypoints to sinks:

```csharp
public sealed class ReachabilityAnalyzer
{
    public ReachabilityResult Analyze(CallGraphSnapshot graph);
}

// Node IDs are assumed to be strings here (compare sink_node_id TEXT in section 6.1).
public record ReachabilityResult
{
    ImmutableHashSet<string> ReachableNodes { get; }
    ImmutableArray<string> ReachableSinks { get; }
    ImmutableDictionary<string, ImmutableArray<string>> ShortestPaths { get; }
}
```

**Location:** `src/Scanner/__Libraries/StellaOps.Scanner.CallGraph/Analysis/`

### 4.3 Drift Detector

Compares base and head graphs:

```csharp
public sealed class ReachabilityDriftDetector
{
    // CodeChange is an assumed name for the element type of the change list.
    public ReachabilityDriftResult Detect(
        CallGraphSnapshot baseGraph,
        CallGraphSnapshot headGraph,
        IReadOnlyList<CodeChange> codeChanges);
}
```

**Location:** `src/Scanner/__Libraries/StellaOps.Scanner.ReachabilityDrift/Services/`

### 4.4 Path Compressor

Reduces full paths to key nodes for storage/display:

```
Full Path (20 nodes):
  entrypoint → A → B → C → ... → X → Y → sink

Compressed Path:
  entrypoint → [changed: B] → [changed: X] → sink
  (intermediateCount: 17)
```

**Location:** `src/Scanner/__Libraries/StellaOps.Scanner.ReachabilityDrift/Services/PathCompressor.cs`

### 4.5 Cause Explainer

Correlates drift with code changes:

```csharp
public sealed class DriftCauseExplainer
{
    public DriftCause Explain(...);
    public DriftCause ExplainUnreachable(...);
}
```

**Location:** `src/Scanner/__Libraries/StellaOps.Scanner.ReachabilityDrift/Services/DriftCauseExplainer.cs`

---

## 5. Language Support Matrix

| Feature | .NET | Java | Go | Python | Node.js | PHP | Ruby |
|---------|------|------|----|--------|---------|-----|------|
| Function extraction | Yes | Yes | Yes | Yes | Partial | Yes | Yes |
| Call edge extraction | Yes | Yes | Yes | Yes | Partial | Yes | Yes |
| HTTP entrypoints | ASP.NET | Spring | net/http | Flask/Django | Express\* | Laravel | Rails |
| gRPC entrypoints | Yes | Yes | Yes | Yes | No | No | No |
| CLI entrypoints | Yes | Yes | Yes | Yes | Partial | Yes | Yes |
| Sink detection | Yes | Yes | Yes | Yes | Partial | Yes | Yes |

\* Requires Sprint 3600.4 completion

---

## 6. Storage Schema

### 6.1 PostgreSQL Tables

**call_graph_snapshots:**

```sql
CREATE TABLE call_graph_snapshots (
    id UUID PRIMARY KEY,
    tenant_id UUID NOT NULL,
    scan_id TEXT NOT NULL,
    language TEXT NOT NULL,
    graph_digest TEXT NOT NULL,
    node_count INT NOT NULL,
    edge_count INT NOT NULL,
    entrypoint_count INT NOT NULL,
    sink_count INT NOT NULL,
    extracted_at TIMESTAMPTZ NOT NULL,
    snapshot_json JSONB NOT NULL
);
```

**reachability_drift_results:**

```sql
CREATE TABLE reachability_drift_results (
    id UUID PRIMARY KEY,
    tenant_id UUID NOT NULL,
    base_scan_id TEXT NOT NULL,
    head_scan_id TEXT NOT NULL,
    language TEXT NOT NULL,
    newly_reachable_count INT NOT NULL,
    newly_unreachable_count INT NOT NULL,
    detected_at TIMESTAMPTZ NOT NULL,
    result_digest TEXT NOT NULL
);
```

**drifted_sinks:**

```sql
CREATE TABLE drifted_sinks (
    id UUID PRIMARY KEY,
    tenant_id UUID NOT NULL,
    drift_result_id UUID NOT NULL REFERENCES reachability_drift_results(id),
    sink_node_id TEXT NOT NULL,
    symbol TEXT NOT NULL,
    sink_category TEXT NOT NULL,
    direction TEXT NOT NULL,
    cause_kind TEXT NOT NULL,
    cause_description TEXT NOT NULL,
    compressed_path JSONB NOT NULL,
    associated_vulns JSONB
);
```

**code_changes:**

```sql
CREATE TABLE code_changes (
    id UUID PRIMARY KEY,
    tenant_id UUID NOT NULL,
    scan_id TEXT NOT NULL,
    base_scan_id TEXT NOT NULL,
    language TEXT NOT NULL,
    file TEXT NOT NULL,
    symbol TEXT NOT NULL,
    change_kind TEXT NOT NULL,
    details JSONB,
    detected_at TIMESTAMPTZ NOT NULL
);
```

### 6.2 Valkey Caching

```
stella:callgraph:{scan_id}:{lang}:{digest}      → Compressed CallGraphSnapshot
stella:callgraph:{scan_id}:{lang}:reachable     → Set of reachable sink IDs
stella:callgraph:{scan_id}:{lang}:paths:{sink}  → Shortest path to sink
```

- **TTL:** Configurable (default 24h)
- **Circuit breaker:** 5 failures → 30s timeout

---

## 7. API Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | `/scans/{scanId}/drift` | Get drift results for a scan |
| GET | `/drift/{driftId}/sinks` | List drifted sinks (paginated) |
| POST | `/scans/{scanId}/compute-reachability` | Trigger reachability computation |
| GET | `/scans/{scanId}/reachability/components` | List components with reachability |
| GET | `/scans/{scanId}/reachability/findings` | Get reachable vulnerable sinks |
| GET | `/scans/{scanId}/reachability/explain` | Explain why a sink is reachable |

See: `docs/api/scanner-drift-api.md`

---

## 8. Integration Points

### 8.1 Policy Module

Drift results feed into policy gates for CI/CD blocking:

```yaml
smart_diff:
  gates:
    - condition: "delta_reachable > 0 AND is_kev = true"
      action: block
```

### 8.2 VEX Emission

Automatic VEX candidate generation on drift:

| Drift Direction | VEX Status | Justification |
|-----------------|------------|---------------|
| `became_unreachable` | `not_affected` | `vulnerable_code_not_in_execute_path` |
| `became_reachable` | (none) | Requires manual review |

### 8.3 Attestation

DSSE-signed drift attestations:

```json
{
  "_type": "https://in-toto.io/Statement/v1",
  "predicateType": "stellaops.dev/predicates/reachability-drift@v1",
  "predicate": {
    "baseScanId": "abc123",
    "headScanId": "def456",
    "newlyReachable": [...],
    "newlyUnreachable": [...],
    "resultDigest": "sha256:..."
  }
}
```

---

## 9. Performance Characteristics

| Metric | Target | Notes |
|--------|--------|-------|
| Graph extraction (100K LOC) | < 60s | Per language |
| Reachability analysis | < 5s | BFS traversal |
| Drift detection | < 10s | Graph comparison |
| Memory usage | < 2GB | Large projects |
| Cache hit improvement | 10x | Valkey lookup vs recompute |

---

## 10. References

- **Implementation Sprints:**
  - `docs/implplan/SPRINT_3600_0002_0001_call_graph_infrastructure.md`
  - `docs/implplan/SPRINT_3600_0003_0001_drift_detection_engine.md`
- **API Reference:** `docs/api/scanner-drift-api.md`
- **Operations Guide:** `docs/operations/reachability-drift-guide.md`
- **Original Advisory:** `docs/product-advisories/archived/17-Dec-2025 - Reachability Drift Detection.md`
- **Source Code:** `src/Scanner/__Libraries/StellaOps.Scanner.ReachabilityDrift/`
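
The multi-source BFS of section 2.2 and the base-vs-head comparison of section 2.3 can be sketched roughly as follows. This is a minimal illustration, not the StellaOps implementation: the class `ReachabilityDriftSketch`, its methods `Reachable` and `Diff`, and the adjacency-map graph shape are all hypothetical, and cause attribution and path compression are left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: names and data shapes are illustrative assumptions,
// not the actual StellaOps.Scanner API.
public static class ReachabilityDriftSketch
{
    // Multi-source BFS: seed the queue with every entrypoint at once,
    // then follow call edges until no unvisited callee remains.
    public static HashSet<string> Reachable(
        IReadOnlyDictionary<string, string[]> edges,
        IEnumerable<string> entrypoints)
    {
        var visited = new HashSet<string>(entrypoints);
        var queue = new Queue<string>(visited);
        while (queue.Count > 0)
        {
            var node = queue.Dequeue();
            if (!edges.TryGetValue(node, out var callees))
            {
                continue; // node makes no outgoing calls
            }
            foreach (var callee in callees)
            {
                // HashSet.Add returns false for nodes already seen.
                if (visited.Add(callee))
                {
                    queue.Enqueue(callee);
                }
            }
        }
        return visited;
    }

    // Drift: classify each sink by how its reachability changed
    // between the base scan and the head scan.
    public static (List<string> BecameReachable, List<string> BecameUnreachable) Diff(
        ISet<string> baseReachable,
        ISet<string> headReachable,
        IEnumerable<string> sinks)
    {
        var becameReachable = new List<string>();
        var becameUnreachable = new List<string>();
        // Ordinal ordering keeps the output deterministic across runs.
        foreach (var sink in sinks.OrderBy(s => s, StringComparer.Ordinal))
        {
            bool before = baseReachable.Contains(sink);
            bool after = headReachable.Contains(sink);
            if (!before && after)
            {
                becameReachable.Add(sink);
            }
            else if (before && !after)
            {
                becameUnreachable.Add(sink);
            }
        }
        return (becameReachable, becameUnreachable);
    }
}
```

For example, if the head scan adds a call edge from an HTTP handler to a helper that invokes a command-execution sink, `Reachable` on the head graph includes that sink while the base graph does not, and `Diff` reports it under `BecameReachable`, which corresponds to the `became_reachable` direction in section 2.3.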