# Reachability Drift Detection - Architecture

**Module:** Scanner
**Version:** 1.0
**Status:** Implemented (core drift engine + API; Node Babel integration pending)
**Last Updated:** 2025-12-22

---

## 1. Overview

Reachability Drift Detection tracks function-level reachability changes between scans. It highlights when code changes create new paths to sensitive sinks or remove existing paths, producing deterministic evidence for triage and VEX workflows.

Key outcomes:

- Detect regressions when previously unreachable sinks become reachable.
- Validate mitigations when reachable sinks become unreachable.
- Provide deterministic evidence for audit and policy decisions.

---

## 2. Key Concepts

### 2.1 Call Graph

A directed graph of function calls:

- Nodes: functions, methods, lambdas with file and line metadata.
- Edges: call relationships (direct, virtual, dynamic).
- Entrypoints: public handlers (HTTP, CLI, background services).
- Sinks: security-sensitive APIs from the sink registry.

### 2.2 Reachability Analysis

Multi-source traversal from entrypoints to sinks to determine exploitability.

### 2.3 Drift Detection

Compares reachability between base and head scans:

- `became_reachable`: risk increased (new path to sink).
- `became_unreachable`: risk decreased (path removed or mitigated).

### 2.4 Cause Attribution

Explains why drift happened by correlating code changes with paths.

---

## 3. Data Flow

```mermaid
flowchart TD
    A[Source or binary] --> B[Call graph extractor]
    B --> C[CallGraphSnapshot]
    C --> D[Reachability analyzer]
    D --> E[ReachabilityResult]
    C --> F[Code change extractor]
    E --> G[ReachabilityDriftDetector]
    F --> G
    G --> H[ReachabilityDriftResult]
    H --> I[Storage + API]
```

---

## 4. Component Architecture

### 4.1 Call Graph Extractors

Registered extractors are configured in `CallGraphServiceCollectionExtensions`.

| Language | Extractor | Status | Notes |
|---|---|---|---|
| .NET | `DotNetCallGraphExtractor` | Registered | Roslyn semantic model. |
| Node.js | `NodeCallGraphExtractor` | Registered (placeholder) | Trace-based fallback; Babel integration pending (Sprint 3600.0004). |
| Java | `JavaCallGraphExtractor` | Library present, not wired | Register extractor to enable. |
| Go | `GoCallGraphExtractor` | Library present, not wired | Register extractor to enable. |
| Python | `PythonCallGraphExtractor` | Library present, not wired | Register extractor to enable. |
| PHP | `PhpCallGraphExtractor` | Library present, not wired | Register extractor to enable. |
| Ruby | `RubyCallGraphExtractor` | Library present, not wired | Register extractor to enable. |
| JavaScript | `JavaScriptCallGraphExtractor` | Library present, not wired | Register extractor to enable. |
| Bun | `BunCallGraphExtractor` | Library present, not wired | Register extractor to enable. |
| Deno | `DenoCallGraphExtractor` | Library present, not wired | Register extractor to enable. |
| Binary | `BinaryCallGraphExtractor` | Library present, not wired | Native call edge extraction. |

### 4.2 Reachability Analyzer

Located in `src/Scanner/__Libraries/StellaOps.Scanner.CallGraph/Analysis/`.

### 4.3 Drift Detector

`ReachabilityDriftDetector` compares base and head snapshots and produces `ReachabilityDriftResult` with compressed paths.

### 4.4 Path Compressor and Cause Explainer

- `PathCompressor` reduces paths to key nodes and optionally includes full paths.
- `DriftCauseExplainer` correlates changes to explain why drift happened.

---

## 5. Language Support Matrix

| Capability | .NET | Node.js | Others (Java/Go/Python/PHP/Ruby/JS/Bun/Deno/Binary) |
|---|---|---|---|
| Call graph extraction | Supported | Placeholder | Library present, not wired |
| Entrypoint detection | Supported | Partial | Library present, not wired |
| Sink detection | Supported | Partial | Library present, not wired |

---

## 6. Storage Schema

Migrations are in `src/Scanner/__Libraries/StellaOps.Scanner.Storage/Postgres/Migrations/`.
Core tables:

- `call_graph_snapshots`: `scan_id`, `language`, `graph_digest`, `extracted_at`, `node_count`, `edge_count`, `entrypoint_count`, `sink_count`, `snapshot_json`.
- `reachability_results`: `scan_id`, `language`, `graph_digest`, `result_digest`, `computed_at`, `reachable_node_count`, `reachable_sink_count`, `result_json`.
- `code_changes`: `scan_id`, `base_scan_id`, `language`, `node_id`, `file`, `symbol`, `change_kind`, `details`, `detected_at`.
- `reachability_drift_results`: `base_scan_id`, `head_scan_id`, `language`, `newly_reachable_count`, `newly_unreachable_count`, `detected_at`, `result_digest`.
- `drifted_sinks`: `drift_result_id`, `sink_node_id`, `sink_category`, `direction`, `cause_kind`, `cause_description`, `compressed_path`, `associated_vulns`.
- `material_risk_changes`: extended with `base_scan_id`, `cause`, `cause_kind`, `path_nodes`, `associated_vulns` for drift attachments.

---

## 7. Cache and Determinism

If the call graph cache is enabled (`CallGraph:Cache`), cached keys follow this pattern:

- `callgraph:graph:{scanId}:{language}`
- `callgraph:reachability:{scanId}:{language}`

Determinism is enforced by stable ordering and deterministic IDs (see `DeterministicIds`).

---

## 8. API Endpoints

Base path: `/api/v1`

| Method | Path | Description |
|---|---|---|
| GET | `/scans/{scanId}/drift` | Get or compute drift results for a scan. |
| GET | `/drift/{driftId}/sinks` | List drifted sinks (paged). |
| POST | `/scans/{scanId}/compute-reachability` | Trigger reachability computation. |
| GET | `/scans/{scanId}/reachability/components` | List components with reachability. |
| GET | `/scans/{scanId}/reachability/findings` | List findings with reachability. |
| GET | `/scans/{scanId}/reachability/explain` | Explain reachability for a CVE and PURL. |

See `docs/api/scanner-drift-api.md` for details.

---

## 9. Integration Points

- Policy gates: planned in `SPRINT_3600_0005_0001_policy_ci_gate_integration.md`.
- VEX candidate emission: planned alongside policy gates.
- Attestation: `StellaOps.Scanner.ReachabilityDrift.Attestation` provides DSSE signing utilities (integration is optional).

---

## 10. Performance Characteristics (Targets)

| Metric | Target | Notes |
|---|---|---|
| Call graph extraction (100K LOC) | < 60s | Per language extractor. |
| Reachability analysis | < 5s | BFS traversal on trimmed graphs. |
| Drift detection | < 10s | Graph comparison and compression. |
| Cache hit improvement | 10x | Valkey cache vs recompute. |

---

## 11. References

- `docs/implplan/archived/SPRINT_3600_0002_0001_call_graph_infrastructure.md`
- `docs/implplan/archived/SPRINT_3600_0003_0001_drift_detection_engine.md`
- `docs/api/scanner-drift-api.md`
- `docs/operations/reachability-drift-guide.md`
- `docs/product-advisories/archived/17-Dec-2025 - Reachability Drift Detection.md`
- `src/Scanner/__Libraries/StellaOps.Scanner.ReachabilityDrift/`
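## 12. Appendix: Illustrative Drift Sketch

The concepts in sections 2.2 and 2.3 (multi-source traversal from entrypoints, then a base-versus-head comparison) can be illustrated with a small, language-neutral sketch. This is not the `ReachabilityDriftDetector` implementation: it is written in Python rather than .NET, the function names `reachable_sinks` and `detect_drift` are hypothetical, and the node names are invented for the example. It assumes a call graph reduced to `(caller, callee)` pairs and uses sorted iteration as one possible way to get the stable ordering that section 7 calls for.

```python
from collections import deque


def reachable_sinks(edges, entrypoints, sinks):
    """Multi-source BFS from all entrypoints; returns the set of sinks reached."""
    adjacency = {}
    for caller, callee in edges:
        adjacency.setdefault(caller, []).append(callee)

    seen = set(entrypoints)
    queue = deque(sorted(entrypoints))  # sorted so the visiting order is stable
    while queue:
        node = queue.popleft()
        for callee in sorted(adjacency.get(node, [])):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen & set(sinks)


def detect_drift(base_reachable, head_reachable):
    """Classify sinks whose reachability differs between base and head scans."""
    return {
        "became_reachable": sorted(head_reachable - base_reachable),
        "became_unreachable": sorted(base_reachable - head_reachable),
    }


# Hypothetical graphs. The head scan adds Api.Get -> Util.Render (opening a path
# to Sql.Raw) and drops Api.Post -> Shell.Exec (closing the path to Shell.Exec).
base_edges = [
    ("Api.Get", "Repo.Load"),
    ("Api.Post", "Shell.Exec"),
    ("Util.Render", "Sql.Raw"),
]
head_edges = [
    ("Api.Get", "Repo.Load"),
    ("Api.Get", "Util.Render"),
    ("Util.Render", "Sql.Raw"),
]
entrypoints = {"Api.Get", "Api.Post"}
sinks = {"Shell.Exec", "Sql.Raw"}

base = reachable_sinks(base_edges, entrypoints, sinks)
head = reachable_sinks(head_edges, entrypoints, sinks)
drift = detect_drift(base, head)
print(drift)
```

In this example `Sql.Raw` lands in `became_reachable` and `Shell.Exec` in `became_unreachable`. A real detector would additionally record the path to each sink and the code change that caused it, which is the role the document assigns to `PathCompressor` and `DriftCauseExplainer`.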