Reachability Drift Detection - Architecture
Module: Scanner
Version: 1.0
Status: Implemented (Sprint 3600.2-3600.3)
Last Updated: 2025-12-22
1. Overview
Reachability Drift Detection tracks function-level reachability changes between scans to identify when code modifications create new paths to vulnerable sinks or mitigate existing risks. This enables security teams to:
- Detect regressions when previously unreachable vulnerabilities become exploitable
- Validate fixes by confirming vulnerable code paths are removed
- Prioritize triage based on actual exploitability rather than theoretical risk
- Automate VEX by generating evidence-backed justifications
2. Key Concepts
2.1 Call Graph
A directed graph representing function/method call relationships in source code:
- Nodes: Functions, methods, lambdas with metadata (file, line, visibility)
- Edges: Call relationships with call kind (direct, virtual, delegate, reflection, dynamic)
- Entrypoints: Public-facing functions (HTTP handlers, CLI commands, message consumers)
- Sinks: Security-sensitive APIs (command execution, SQL, file I/O, deserialization)
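To make the four concepts concrete, here is a minimal sketch of a call graph as a data structure. It is illustrative only: the class and field names (`Node`, `Edge`, `CallGraph`, `is_entrypoint`, `is_sink`) are assumptions for this example and are not the actual `CallGraphSnapshot` schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class CallKind(Enum):
    # Call kinds listed in this section.
    DIRECT = "direct"
    VIRTUAL = "virtual"
    DELEGATE = "delegate"
    REFLECTION = "reflection"
    DYNAMIC = "dynamic"


@dataclass(frozen=True)
class Node:
    # Hypothetical node shape: an ID plus the metadata named above.
    node_id: str
    file: str
    line: int
    visibility: str
    is_entrypoint: bool = False
    is_sink: bool = False


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    kind: CallKind


@dataclass
class CallGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def add_edge(self, source: str, target: str, kind: CallKind) -> None:
        self.edges.append(Edge(source, target, kind))

    def entrypoints(self) -> list:
        return [n.node_id for n in self.nodes.values() if n.is_entrypoint]

    def sinks(self) -> list:
        return [n.node_id for n in self.nodes.values() if n.is_sink]


# Illustrative data, not taken from a real scan.
graph = CallGraph()
graph.add_node(Node("Api.GetUser", "Api.cs", 10, "public", is_entrypoint=True))
graph.add_node(Node("Db.Query", "Db.cs", 42, "internal", is_sink=True))
graph.add_edge("Api.GetUser", "Db.Query", CallKind.DIRECT)
```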
2.2 Reachability Analysis
Multi-source BFS traversal from entrypoints to determine which sinks are reachable, and therefore potentially exploitable:
Entrypoints (HTTP handlers, CLI)
│
▼ BFS traversal
[Application Code]
│
▼
Sinks (exec, query, writeFile)
│
▼
Reachable = TRUE if path exists
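The traversal in the diagram can be sketched as a multi-source breadth-first search: every entrypoint is placed in the queue up front, and a sink counts as reachable if the search visits it. This is a minimal sketch over a plain adjacency map; the function name and the node IDs are hypothetical.

```python
from collections import deque


def reachable_sinks(edges, entrypoints, sinks):
    """Return the sinks reachable from at least one entrypoint."""
    # Seed the queue with all entrypoints at once (multi-source BFS).
    visited = set(entrypoints)
    queue = deque(entrypoints)
    while queue:
        node = queue.popleft()
        for callee in edges.get(node, ()):
            if callee not in visited:
                visited.add(callee)
                queue.append(callee)
    return visited & set(sinks)


# Hypothetical graph: caller -> list of callees.
edges = {
    "http:GET /users": ["UserService.find"],
    "UserService.find": ["Db.query"],
    "cli:export": ["Exporter.run"],
    "Legacy.shell": ["Process.exec"],  # no entrypoint leads here
}
result = reachable_sinks(
    edges,
    ["http:GET /users", "cli:export"],
    ["Db.query", "Process.exec"],
)
```

Because each node is visited at most once, the cost is linear in the number of nodes and edges regardless of how many entrypoints there are.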
2.3 Drift Detection
Compares reachability between two scans (base vs head):
| Transition | Direction | Risk Impact |
|---|---|---|
| Unreachable → Reachable | became_reachable | Increased - New exploit path |
| Reachable → Unreachable | became_unreachable | Decreased - Mitigation applied |
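Given the set of reachable sinks for each scan, the two transitions reduce to set differences. The sketch below assumes sinks are identified by stable string IDs across scans; the function name is hypothetical.

```python
def classify_drift(base_reachable, head_reachable):
    """Bucket sinks by how their reachability changed from base to head."""
    base, head = set(base_reachable), set(head_reachable)
    return {
        # Reachable in head only: a new path to the sink appeared.
        "became_reachable": sorted(head - base),
        # Reachable in base only: the path (or the sink itself) is gone.
        "became_unreachable": sorted(base - head),
    }


drift = classify_drift(
    base_reachable={"Db.query", "File.write"},
    head_reachable={"Db.query", "Process.exec"},
)
```

Sinks reachable in both scans, or in neither, produce no drift entry.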
2.4 Cause Attribution
Explains why drift occurred by correlating with code changes:
| Cause Kind | Description | Example |
|---|---|---|
| guard_removed | Conditional check removed | if (!authorized) deleted |
| guard_added | New conditional blocks path | Added null check |
| new_public_route | New entrypoint created | Added /api/admin endpoint |
| visibility_escalated | Internal → Public | Method made public |
| dependency_upgraded | Library update changed behavior | lodash 4.x → 5.x |
| symbol_removed | Function deleted | Removed vulnerable helper |
| unknown | Cannot determine | Multiple simultaneous changes |
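One way to read the table is as a lookup from code changes that touch the drifted path to a single cause kind, falling back to unknown when the evidence is ambiguous. The sketch below is an assumption about that idea, not the actual attribution logic: the change-record shape (`symbol`, `kind`) and the single-kind rule are hypothetical.

```python
def attribute_cause(path, changes):
    """Pick a cause kind from the changes whose symbol lies on the path."""
    on_path = {c["kind"] for c in changes if c["symbol"] in path}
    if len(on_path) == 1:
        return on_path.pop()
    # No change on the path, or several different kinds at once.
    return "unknown"


path = ["Api.Admin", "AuthFilter.check", "Process.exec"]
single = attribute_cause(
    path,
    [{"symbol": "AuthFilter.check", "kind": "guard_removed"}],
)
ambiguous = attribute_cause(
    path,
    [
        {"symbol": "AuthFilter.check", "kind": "guard_removed"},
        {"symbol": "Api.Admin", "kind": "new_public_route"},
    ],
)
```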
3. Data Flow
flowchart TD
subgraph Scan["Scan Execution"]
A[Source Code] --> B[Call Graph Extractor]
B --> C[CallGraphSnapshot]
end
subgraph Analysis["Drift Analysis"]
C --> D[Reachability Analyzer]
D --> E[ReachabilityResult]
F[Base Scan Graph] --> G[Drift Detector]
E --> G
H[Code Changes] --> G
G --> I[ReachabilityDriftResult]
end
subgraph Output["Output"]
I --> J[Path Compressor]
J --> K[Compressed Paths]
I --> L[Cause Explainer]
L --> M[Drift Causes]
K --> N[Storage/API]
M --> N
end
subgraph Integration["Integration"]
N --> O[Policy Gates]
N --> P[VEX Emission]
N --> Q[Web UI]
end
4. Component Architecture
4.1 Call Graph Extractors
Per-language static analysis (AST, bytecode, or SSA, depending on the language) producing a CallGraphSnapshot:
| Language | Extractor | Technology | Status |
|---|---|---|---|
| .NET | DotNetCallGraphExtractor | Roslyn semantic model | Done |
| Java | JavaCallGraphExtractor | ASM bytecode analysis | Done |
| Go | GoCallGraphExtractor | golang.org/x/tools SSA | Done |
| Python | PythonCallGraphExtractor | Python AST | Done |
| Node.js | NodeCallGraphExtractor | Babel (planned) | Skeleton |
| PHP | PhpCallGraphExtractor | php-parser | Done |
| Ruby | RubyCallGraphExtractor | parser gem | Done |
Location: src/Scanner/__Libraries/StellaOps.Scanner.CallGraph/Extraction/
4.2 Reachability Analyzer
Multi-source BFS from entrypoints to sinks:
public sealed class ReachabilityAnalyzer
{
    public ReachabilityResult Analyze(CallGraphSnapshot graph);
}

public record ReachabilityResult
{
    ImmutableHashSet<string> ReachableNodes { get; }
    ImmutableArray<string> ReachableSinks { get; }
    ImmutableDictionary<string, ImmutableArray<string>> ShortestPaths { get; }
}
Location: src/Scanner/__Libraries/StellaOps.Scanner.CallGraph/Analysis/
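The `ShortestPaths` member suggests the analyzer records one shortest path per reachable sink. A common way to get this from BFS is to store each node's predecessor on first visit and walk the chain back from the sink. The sketch below shows that technique on a plain adjacency map; it is not the C# implementation, and the function name is hypothetical.

```python
from collections import deque


def shortest_paths(edges, entrypoints, sinks):
    """Map each reachable sink to one shortest entrypoint-to-sink path."""
    # parent[n] is the node BFS came from; entrypoints have no parent.
    parent = {entry: None for entry in entrypoints}
    queue = deque(entrypoints)
    while queue:
        node = queue.popleft()
        for callee in edges.get(node, ()):
            if callee not in parent:
                parent[callee] = node
                queue.append(callee)

    paths = {}
    for sink in sinks:
        if sink not in parent:
            continue  # unreachable sinks get no entry
        path, current = [], sink
        while current is not None:
            path.append(current)
            current = parent[current]
        paths[sink] = path[::-1]
    return paths


edges = {"main": ["a", "b"], "a": ["c"], "b": ["exec"], "c": ["exec"]}
paths = shortest_paths(edges, ["main"], ["exec", "unused_sink"])
```

BFS visits nodes in order of distance from the nearest entrypoint, so the first predecessor recorded for a node always lies on a shortest path.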
4.3 Drift Detector
Compares base and head graphs:
public sealed class ReachabilityDriftDetector
{
    public ReachabilityDriftResult Detect(
        CallGraphSnapshot baseGraph,
        CallGraphSnapshot headGraph,
        IReadOnlyList<CodeChangeFact> codeChanges);
}
Location: src/Scanner/__Libraries/StellaOps.Scanner.ReachabilityDrift/Services/
4.4 Path Compressor
Reduces full paths to key nodes for storage/display:
Full Path (20 nodes):
entrypoint → A → B → C → ... → X → Y → sink
Compressed Path:
entrypoint → [changed: B] → [changed: X] → sink
(intermediateCount: 17)
Location: src/Scanner/__Libraries/StellaOps.Scanner.ReachabilityDrift/Services/PathCompressor.cs
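A minimal sketch of the compression idea: keep the entrypoint, the sink, and any node touched by a code change, and record how many nodes were dropped. The field names `key_nodes` and `elided_count` are hypothetical; in particular, this sketch does not claim to reproduce how `intermediateCount` is computed in `PathCompressor.cs`.

```python
def compress_path(path, changed):
    """Keep the endpoints and changed nodes; count everything else."""
    if len(path) <= 2:
        return {"key_nodes": list(path), "elided_count": 0}
    key_nodes = [path[0]]
    # Preserve original order of the changed intermediate nodes.
    key_nodes += [node for node in path[1:-1] if node in changed]
    key_nodes.append(path[-1])
    return {
        "key_nodes": key_nodes,
        "elided_count": len(path) - len(key_nodes),
    }


full = ["entrypoint", "A", "B", "C", "D", "X", "Y", "sink"]
compressed = compress_path(full, changed={"B", "X"})
```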
4.5 Cause Explainer
Correlates drift with code changes:
public sealed class DriftCauseExplainer
{
    public DriftCause Explain(...);
    public DriftCause ExplainUnreachable(...);
}
Location: src/Scanner/__Libraries/StellaOps.Scanner.ReachabilityDrift/Services/DriftCauseExplainer.cs
5. Language Support Matrix
| Feature | .NET | Java | Go | Python | Node.js | PHP | Ruby |
|---|---|---|---|---|---|---|---|
| Function extraction | Yes | Yes | Yes | Yes | Partial | Yes | Yes |
| Call edge extraction | Yes | Yes | Yes | Yes | Partial | Yes | Yes |
| HTTP entrypoints | ASP.NET | Spring | net/http | Flask/Django | Express* | Laravel | Rails |
| gRPC entrypoints | Yes | Yes | Yes | Yes | No | No | No |
| CLI entrypoints | Yes | Yes | Yes | Yes | Partial | Yes | Yes |
| Sink detection | Yes | Yes | Yes | Yes | Partial | Yes | Yes |
*Requires Sprint 3600.4 completion
6. Storage Schema
6.1 PostgreSQL Tables
call_graph_snapshots:
CREATE TABLE call_graph_snapshots (
    id UUID PRIMARY KEY,
    tenant_id UUID NOT NULL,
    scan_id TEXT NOT NULL,
    language TEXT NOT NULL,
    graph_digest TEXT NOT NULL,
    node_count INT NOT NULL,
    edge_count INT NOT NULL,
    entrypoint_count INT NOT NULL,
    sink_count INT NOT NULL,
    extracted_at TIMESTAMPTZ NOT NULL,
    snapshot_json JSONB NOT NULL
);
reachability_drift_results:
CREATE TABLE reachability_drift_results (
    id UUID PRIMARY KEY,
    tenant_id UUID NOT NULL,
    base_scan_id TEXT NOT NULL,
    head_scan_id TEXT NOT NULL,
    language TEXT NOT NULL,
    newly_reachable_count INT NOT NULL,
    newly_unreachable_count INT NOT NULL,
    detected_at TIMESTAMPTZ NOT NULL,
    result_digest TEXT NOT NULL
);
drifted_sinks:
CREATE TABLE drifted_sinks (
    id UUID PRIMARY KEY,
    tenant_id UUID NOT NULL,
    drift_result_id UUID NOT NULL REFERENCES reachability_drift_results(id),
    sink_node_id TEXT NOT NULL,
    symbol TEXT NOT NULL,
    sink_category TEXT NOT NULL,
    direction TEXT NOT NULL,
    cause_kind TEXT NOT NULL,
    cause_description TEXT NOT NULL,
    compressed_path JSONB NOT NULL,
    associated_vulns JSONB
);
code_changes:
CREATE TABLE code_changes (
    id UUID PRIMARY KEY,
    tenant_id UUID NOT NULL,
    scan_id TEXT NOT NULL,
    base_scan_id TEXT NOT NULL,
    language TEXT NOT NULL,
    file TEXT NOT NULL,
    symbol TEXT NOT NULL,
    change_kind TEXT NOT NULL,
    details JSONB,
    detected_at TIMESTAMPTZ NOT NULL
);
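As an illustration of how these tables relate, the sketch below lists the newly reachable sinks for a head scan by joining drifted_sinks to reachability_drift_results. It uses an in-memory SQLite database as a stand-in for PostgreSQL and only the columns the query needs; the row values are made up, and the query itself is an assumption about typical usage rather than a query taken from the codebase.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed copies of two tables; types simplified for SQLite.
conn.executescript(
    """
    CREATE TABLE reachability_drift_results (
        id TEXT PRIMARY KEY,
        tenant_id TEXT NOT NULL,
        head_scan_id TEXT NOT NULL
    );
    CREATE TABLE drifted_sinks (
        id TEXT PRIMARY KEY,
        tenant_id TEXT NOT NULL,
        drift_result_id TEXT NOT NULL
            REFERENCES reachability_drift_results(id),
        symbol TEXT NOT NULL,
        direction TEXT NOT NULL,
        cause_kind TEXT NOT NULL
    );
    """
)
conn.execute(
    "INSERT INTO reachability_drift_results VALUES ('r1', 't1', 'def456')"
)
conn.executemany(
    "INSERT INTO drifted_sinks VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("s1", "t1", "r1", "Process.exec", "became_reachable", "guard_removed"),
        ("s2", "t1", "r1", "File.write", "became_unreachable", "symbol_removed"),
    ],
)

# Always filter by tenant_id so one tenant never sees another's results.
rows = conn.execute(
    """
    SELECT s.symbol, s.cause_kind
    FROM drifted_sinks AS s
    JOIN reachability_drift_results AS r ON r.id = s.drift_result_id
    WHERE r.tenant_id = ? AND r.head_scan_id = ? AND s.direction = ?
    ORDER BY s.symbol
    """,
    ("t1", "def456", "became_reachable"),
).fetchall()
```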
6.2 Valkey Caching
stella:callgraph:{scan_id}:{lang}:{digest} → Compressed CallGraphSnapshot
stella:callgraph:{scan_id}:{lang}:reachable → Set of reachable sink IDs
stella:callgraph:{scan_id}:{lang}:paths:{sink} → Shortest path to sink
TTL: Configurable (default 24h)
Circuit breaker: 5 failures → 30s timeout
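The key patterns and the circuit-breaker rule can be sketched as follows. The sketch assumes "5 failures → 30s timeout" means the breaker opens after five consecutive cache failures and skips the cache for 30 seconds; the class and function names are hypothetical, and no real Valkey client is involved.

```python
import time


def snapshot_key(scan_id, lang, digest):
    return f"stella:callgraph:{scan_id}:{lang}:{digest}"


def reachable_key(scan_id, lang):
    return f"stella:callgraph:{scan_id}:{lang}:reachable"


def path_key(scan_id, lang, sink):
    return f"stella:callgraph:{scan_id}:{lang}:paths:{sink}"


class CircuitBreaker:
    """Stops calling the cache for a cool-down period after repeated failures."""

    def __init__(self, failure_threshold=5, open_seconds=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.open_seconds = open_seconds
        self.clock = clock  # injectable so tests can control time
        self.failures = 0
        self.opened_at = None

    def allow(self):
        """Return True if the cache may be called right now."""
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.open_seconds:
            # Cool-down elapsed: close the breaker and start counting afresh.
            self.opened_at = None
            self.failures = 0
            return True
        return False

    def record_success(self):
        self.failures = 0

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()
```

While the breaker is open, callers would fall back to recomputing reachability instead of waiting on a failing cache.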
7. API Endpoints
| Method | Path | Description |
|---|---|---|
| GET | /scans/{scanId}/drift | Get drift results for a scan |
| GET | /drift/{driftId}/sinks | List drifted sinks (paginated) |
| POST | /scans/{scanId}/compute-reachability | Trigger reachability computation |
| GET | /scans/{scanId}/reachability/components | List components with reachability |
| GET | /scans/{scanId}/reachability/findings | Get reachable vulnerable sinks |
| GET | /scans/{scanId}/reachability/explain | Explain why a sink is reachable |
See: docs/api/scanner-drift-api.md
8. Integration Points
8.1 Policy Module
Drift results feed into policy gates for CI/CD blocking:
smart_diff:
  gates:
    - condition: "delta_reachable > 0 AND is_kev = true"
      action: block
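To show how such a gate might be evaluated, here is a toy evaluator for conditions of the form shown above: clauses joined by AND, each comparing a named fact with an integer or boolean literal. The grammar is inferred from this single example and is an assumption; the real policy engine's expression language may differ.

```python
import operator

# Comparison operators supported by this sketch only.
OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


def parse_literal(text):
    if text in ("true", "false"):
        return text == "true"
    return int(text)


def gate_matches(condition, facts):
    """Return True when every AND-joined clause holds for the given facts."""
    for clause in condition.split(" AND "):
        name, op, raw = clause.split()
        if not OPS[op](facts[name], parse_literal(raw)):
            return False
    return True


condition = "delta_reachable > 0 AND is_kev = true"
blocked = gate_matches(condition, {"delta_reachable": 2, "is_kev": True})
allowed = gate_matches(condition, {"delta_reachable": 2, "is_kev": False})
```

With this gate, a build is blocked only when drift introduced at least one newly reachable sink and the associated vulnerability is flagged as known-exploited.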
8.2 VEX Emission
Automatic VEX candidate generation on drift:
| Drift Direction | VEX Status | Justification |
|---|---|---|
| became_unreachable | not_affected | vulnerable_code_not_in_execute_path |
| became_reachable | (none) | Requires manual review |
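The mapping in the table is small enough to express directly. In this sketch the returned dictionary shape and the `needs_review` flag are hypothetical; only the status and justification strings come from the table.

```python
def vex_candidate(direction):
    """Translate a drift direction into a VEX candidate, per the table."""
    if direction == "became_unreachable":
        return {
            "status": "not_affected",
            "justification": "vulnerable_code_not_in_execute_path",
            "needs_review": False,
        }
    if direction == "became_reachable":
        # No status is emitted automatically; a human must decide.
        return {"status": None, "justification": None, "needs_review": True}
    raise ValueError(f"unknown drift direction: {direction}")


candidate = vex_candidate("became_unreachable")
```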
8.3 Attestation
DSSE-signed drift attestations:
{
  "_type": "https://in-toto.io/Statement/v1",
  "predicateType": "stellaops.dev/predicates/reachability-drift@v1",
  "predicate": {
    "baseScanId": "abc123",
    "headScanId": "def456",
    "newlyReachable": [...],
    "newlyUnreachable": [...],
    "resultDigest": "sha256:..."
  }
}
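DSSE does not sign the JSON payload directly; it signs a pre-authentication encoding (PAE) that binds the payload type to the payload bytes. The sketch below builds a statement shaped like the example above and computes the bytes a signer would sign. The helper names, the sample sink ID, and the choice of compact sorted-key JSON are assumptions; the digest placeholder is left elided as in the example, and the signing step itself is omitted.

```python
import json

# Payload type conventionally used for in-toto statements in DSSE envelopes.
PAYLOAD_TYPE = "application/vnd.in-toto+json"


def build_statement(base_scan_id, head_scan_id, newly_reachable,
                    newly_unreachable, result_digest):
    return {
        "_type": "https://in-toto.io/Statement/v1",
        "predicateType": "stellaops.dev/predicates/reachability-drift@v1",
        "predicate": {
            "baseScanId": base_scan_id,
            "headScanId": head_scan_id,
            "newlyReachable": newly_reachable,
            "newlyUnreachable": newly_unreachable,
            "resultDigest": result_digest,
        },
    }


def pae(payload_type, payload):
    """DSSE v1 PAE: 'DSSEv1' SP len(type) SP type SP len(body) SP body."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (
        len(type_bytes), type_bytes, len(payload), payload,
    )


# "Process.exec" is an illustrative sink ID; the digest stays elided.
statement = build_statement("abc123", "def456", ["Process.exec"], [],
                            "sha256:...")
payload = json.dumps(statement, sort_keys=True,
                     separators=(",", ":")).encode("utf-8")
to_sign = pae(PAYLOAD_TYPE, payload)
```

The signature is computed over `to_sign`, while the envelope carries the base64-encoded `payload` alongside the payload type and signatures.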
9. Performance Characteristics
| Metric | Target | Notes |
|---|---|---|
| Graph extraction (100K LOC) | < 60s | Per language |
| Reachability analysis | < 5s | BFS traversal |
| Drift detection | < 10s | Graph comparison |
| Memory usage | < 2GB | Large projects |
| Cache hit improvement | 10x | Valkey lookup vs recompute |
10. References
- Implementation Sprints:
  - docs/implplan/SPRINT_3600_0002_0001_call_graph_infrastructure.md
  - docs/implplan/SPRINT_3600_0003_0001_drift_detection_engine.md
- API Reference: docs/api/scanner-drift-api.md
- Operations Guide: docs/operations/reachability-drift-guide.md
- Original Advisory: docs/product-advisories/archived/17-Dec-2025 - Reachability Drift Detection.md
- Source Code: src/Scanner/__Libraries/StellaOps.Scanner.ReachabilityDrift/