
Reachability Drift Detection - Architecture

Module: Scanner
Version: 1.0
Status: Implemented (Sprint 3600.2-3600.3)
Last Updated: 2025-12-22


1. Overview

Reachability Drift Detection tracks function-level reachability changes between scans to identify when code modifications create new paths to vulnerable sinks or mitigate existing risks. This enables security teams to:

  • Detect regressions when previously unreachable vulnerabilities become exploitable
  • Validate fixes by confirming vulnerable code paths are removed
  • Prioritize triage based on actual exploitability rather than theoretical risk
  • Automate VEX by generating evidence-backed justifications

2. Key Concepts

2.1 Call Graph

A directed graph representing function/method call relationships in source code:

  • Nodes: Functions, methods, lambdas with metadata (file, line, visibility)
  • Edges: Call relationships with call kind (direct, virtual, delegate, reflection, dynamic)
  • Entrypoints: Public-facing functions (HTTP handlers, CLI commands, message consumers)
  • Sinks: Security-sensitive APIs (command execution, SQL, file I/O, deserialization)

2.2 Reachability Analysis

Multi-source BFS traversal from all entrypoints at once to determine which sinks are reachable, and therefore potentially exploitable:

Entrypoints (HTTP handlers, CLI)
        │
        ▼ BFS traversal
    [Application Code]
        │
        ▼
    Sinks (exec, query, writeFile)
        │
        ▼
    Reachable = TRUE if path exists
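
The traversal above can be sketched in a few lines. This is a minimal, language-neutral illustration written in Python rather than the module's C#; the function name `analyze_reachability`, the edge-list input, and the sample node names are assumptions made for the example, not the Scanner's actual API.

```python
from collections import deque


def analyze_reachability(edges, entrypoints, sinks):
    """Multi-source BFS over a call graph (hypothetical helper).

    edges: iterable of (caller, callee) pairs.
    Returns (reachable_nodes, shortest_paths) where shortest_paths maps
    each reachable sink to one shortest entrypoint-to-sink path.
    """
    # Adjacency list: caller -> list of callees.
    adjacency = {}
    for caller, callee in edges:
        adjacency.setdefault(caller, []).append(callee)

    # Seed the queue with every entrypoint at once (multi-source).
    parent = {entry: None for entry in entrypoints}
    queue = deque(parent)
    while queue:
        node = queue.popleft()
        for callee in adjacency.get(node, []):
            if callee not in parent:  # first visit has the fewest hops
                parent[callee] = node
                queue.append(callee)

    def path_to(node):
        # Walk parent links back to the entrypoint, then reverse.
        path = []
        while node is not None:
            path.append(node)
            node = parent[node]
        return path[::-1]

    shortest_paths = {s: path_to(s) for s in sinks if s in parent}
    return set(parent), shortest_paths


# Illustrative graph: "write_file" is only called from dead code.
edges = [
    ("http_handler", "parse_request"),
    ("parse_request", "run_report"),
    ("run_report", "exec"),
    ("cli_main", "run_report"),
    ("dead_helper", "write_file"),
]
reachable, paths = analyze_reachability(
    edges, entrypoints=["http_handler", "cli_main"], sinks=["exec", "write_file"]
)
```

Because every entrypoint is enqueued before the traversal starts, a single pass visits each node at most once, and the first path recorded for a sink is one with the fewest hops from any entrypoint.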

2.3 Drift Detection

Compares reachability between two scans (base vs head):

| Transition | Direction | Risk Impact |
|---|---|---|
| Unreachable → Reachable | became_reachable | Increased - New exploit path |
| Reachable → Unreachable | became_unreachable | Decreased - Mitigation applied |
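
At its core, the two transitions reduce to set differences between the reachable sinks of the base and head scans. The sketch below assumes each scan has already been reduced to a collection of reachable sink IDs; `detect_drift` and the sample sink names are hypothetical, not the module's API.

```python
def detect_drift(base_reachable, head_reachable):
    """Classify sinks by how their reachability changed from base to head."""
    base, head = set(base_reachable), set(head_reachable)
    return {
        # Reachable in head but not in base: a new path to the sink.
        "became_reachable": sorted(head - base),
        # Reachable in base but not in head: the path was removed.
        "became_unreachable": sorted(base - head),
    }


drift = detect_drift(
    base_reachable={"exec", "query"},
    head_reachable={"exec", "writeFile"},
)
```

Sinks that are reachable in both scans, such as `exec` here, do not appear in either list because their state did not change.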

2.4 Cause Attribution

Explains why drift occurred by correlating with code changes:

| Cause Kind | Description | Example |
|---|---|---|
| guard_removed | Conditional check removed | if (!authorized) deleted |
| guard_added | New conditional blocks path | Added null check |
| new_public_route | New entrypoint created | Added /api/admin endpoint |
| visibility_escalated | Internal → Public | Method made public |
| dependency_upgraded | Library update changed behavior | lodash 4.x → 5.x |
| symbol_removed | Function deleted | Removed vulnerable helper |
| unknown | Cannot determine | Multiple simultaneous changes |
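
One plausible way to apply these cause kinds is sketched below. It assumes, purely for illustration, that each code change is a `(symbol, change_kind)` pair whose kind is already one of the cause kinds listed above, and that a cause is attributed only when exactly one kind of change touches the drifted path. The grouping into risk-increasing and risk-decreasing kinds is also an assumption; the real `DriftCauseExplainer` may use different rules.

```python
# Assumed grouping: which change kinds can plausibly explain each direction.
INCREASES = {"guard_removed", "new_public_route", "visibility_escalated",
             "dependency_upgraded"}
DECREASES = {"guard_added", "symbol_removed", "dependency_upgraded"}


def explain_cause(direction, path_nodes, changes):
    """Pick a cause kind for one drifted sink (hypothetical helper).

    direction: "became_reachable" or "became_unreachable".
    path_nodes: symbols on the path to the sink.
    changes: iterable of (symbol, change_kind) pairs.
    """
    plausible = INCREASES if direction == "became_reachable" else DECREASES
    on_path = set(path_nodes)
    kinds = {
        kind for symbol, kind in changes
        if symbol in on_path and kind in plausible
    }
    # Exactly one candidate kind: attribute it. Otherwise give up.
    return kinds.pop() if len(kinds) == 1 else "unknown"


path = ["AdminController.Delete", "Repo.Purge", "exec"]
single = explain_cause(
    "became_reachable", path,
    [("AdminController.Delete", "guard_removed"), ("Other.Helper", "guard_added")],
)
ambiguous = explain_cause(
    "became_reachable", path,
    [("AdminController.Delete", "guard_removed"),
     ("Repo.Purge", "visibility_escalated")],
)
```

The second call falls back to `unknown`, matching the "Multiple simultaneous changes" example in the table.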

3. Data Flow

flowchart TD
    subgraph Scan["Scan Execution"]
        A[Source Code] --> B[Call Graph Extractor]
        B --> C[CallGraphSnapshot]
    end

    subgraph Analysis["Drift Analysis"]
        C --> D[Reachability Analyzer]
        D --> E[ReachabilityResult]

        F[Base Scan Graph] --> G[Drift Detector]
        E --> G
        H[Code Changes] --> G
        G --> I[ReachabilityDriftResult]
    end

    subgraph Output["Output"]
        I --> J[Path Compressor]
        J --> K[Compressed Paths]
        I --> L[Cause Explainer]
        L --> M[Drift Causes]

        K --> N[Storage/API]
        M --> N
    end

    subgraph Integration["Integration"]
        N --> O[Policy Gates]
        N --> P[VEX Emission]
        N --> Q[Web UI]
    end

4. Component Architecture

4.1 Call Graph Extractors

Per-language static analysis (AST, bytecode, or SSA, depending on the language) producing a CallGraphSnapshot:

| Language | Extractor | Technology | Status |
|---|---|---|---|
| .NET | DotNetCallGraphExtractor | Roslyn semantic model | Done |
| Java | JavaCallGraphExtractor | ASM bytecode analysis | Done |
| Go | GoCallGraphExtractor | golang.org/x/tools SSA | Done |
| Python | PythonCallGraphExtractor | Python AST | Done |
| Node.js | NodeCallGraphExtractor | Babel (planned) | Skeleton |
| PHP | PhpCallGraphExtractor | php-parser | Done |
| Ruby | RubyCallGraphExtractor | parser gem | Done |

Location: src/Scanner/__Libraries/StellaOps.Scanner.CallGraph/Extraction/

4.2 Reachability Analyzer

Multi-source BFS from entrypoints to sinks:

public sealed class ReachabilityAnalyzer
{
    public ReachabilityResult Analyze(CallGraphSnapshot graph);
}

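public record ReachabilityResult
{
    // Members are public so that callers can read the analysis output.
    public ImmutableHashSet<string> ReachableNodes { get; init; }
    public ImmutableArray<string> ReachableSinks { get; init; }
    public ImmutableDictionary<string, ImmutableArray<string>> ShortestPaths { get; init; }
}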

Location: src/Scanner/__Libraries/StellaOps.Scanner.CallGraph/Analysis/

4.3 Drift Detector

Compares base and head graphs:

public sealed class ReachabilityDriftDetector
{
    public ReachabilityDriftResult Detect(
        CallGraphSnapshot baseGraph,
        CallGraphSnapshot headGraph,
        IReadOnlyList<CodeChangeFact> codeChanges);
}

Location: src/Scanner/__Libraries/StellaOps.Scanner.ReachabilityDrift/Services/

4.4 Path Compressor

Reduces full paths to key nodes for storage/display:

Full Path (20 nodes):
  entrypoint → A → B → C → ... → X → Y → sink

Compressed Path:
  entrypoint → [changed: B] → [changed: X] → sink
  (intermediateCount: 17)

Location: src/Scanner/__Libraries/StellaOps.Scanner.ReachabilityDrift/Services/PathCompressor.cs
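
A minimal sketch of the compression idea follows. It keeps the entrypoint, the sink, and any changed nodes in between, and records how many nodes were dropped. The function name, the output field names other than `intermediateCount`, and the exact counting rule are assumptions; the real PathCompressor may count intermediates differently.

```python
def compress_path(path, changed_nodes):
    """Reduce a full call path to its key nodes (hypothetical helper)."""
    if len(path) < 2:
        raise ValueError("a path needs at least an entrypoint and a sink")
    changed = set(changed_nodes)
    # Keep only interior nodes that were touched by a code change.
    key_nodes = [node for node in path[1:-1] if node in changed]
    return {
        "entrypoint": path[0],
        "keyNodes": key_nodes,
        "sink": path[-1],
        # Assumed rule: interior nodes that were dropped from the display.
        "intermediateCount": len(path) - 2 - len(key_nodes),
    }


compressed = compress_path(
    ["entrypoint", "A", "B", "C", "D", "sink"], changed_nodes={"B", "D"}
)
```

Here the six-node path keeps `B` and `D` as key nodes and reports two dropped intermediates (`A` and `C`).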

4.5 Cause Explainer

Correlates drift with code changes:

public sealed class DriftCauseExplainer
{
    public DriftCause Explain(...);
    public DriftCause ExplainUnreachable(...);
}

Location: src/Scanner/__Libraries/StellaOps.Scanner.ReachabilityDrift/Services/DriftCauseExplainer.cs


5. Language Support Matrix

| Feature | .NET | Java | Go | Python | Node.js | PHP | Ruby |
|---|---|---|---|---|---|---|---|
| Function extraction | Yes | Yes | Yes | Yes | Partial | Yes | Yes |
| Call edge extraction | Yes | Yes | Yes | Yes | Partial | Yes | Yes |
| HTTP entrypoints | ASP.NET | Spring | net/http | Flask/Django | Express* | Laravel | Rails |
| gRPC entrypoints | Yes | Yes | Yes | Yes | No | No | No |
| CLI entrypoints | Yes | Yes | Yes | Yes | Partial | Yes | Yes |
| Sink detection | Yes | Yes | Yes | Yes | Partial | Yes | Yes |

*Requires Sprint 3600.4 completion


6. Storage Schema

6.1 PostgreSQL Tables

call_graph_snapshots:

CREATE TABLE call_graph_snapshots (
    id UUID PRIMARY KEY,
    tenant_id UUID NOT NULL,
    scan_id TEXT NOT NULL,
    language TEXT NOT NULL,
    graph_digest TEXT NOT NULL,
    node_count INT NOT NULL,
    edge_count INT NOT NULL,
    entrypoint_count INT NOT NULL,
    sink_count INT NOT NULL,
    extracted_at TIMESTAMPTZ NOT NULL,
    snapshot_json JSONB NOT NULL
);

reachability_drift_results:

CREATE TABLE reachability_drift_results (
    id UUID PRIMARY KEY,
    tenant_id UUID NOT NULL,
    base_scan_id TEXT NOT NULL,
    head_scan_id TEXT NOT NULL,
    language TEXT NOT NULL,
    newly_reachable_count INT NOT NULL,
    newly_unreachable_count INT NOT NULL,
    detected_at TIMESTAMPTZ NOT NULL,
    result_digest TEXT NOT NULL
);

drifted_sinks:

CREATE TABLE drifted_sinks (
    id UUID PRIMARY KEY,
    tenant_id UUID NOT NULL,
    drift_result_id UUID NOT NULL REFERENCES reachability_drift_results(id),
    sink_node_id TEXT NOT NULL,
    symbol TEXT NOT NULL,
    sink_category TEXT NOT NULL,
    direction TEXT NOT NULL,
    cause_kind TEXT NOT NULL,
    cause_description TEXT NOT NULL,
    compressed_path JSONB NOT NULL,
    associated_vulns JSONB
);

code_changes:

CREATE TABLE code_changes (
    id UUID PRIMARY KEY,
    tenant_id UUID NOT NULL,
    scan_id TEXT NOT NULL,
    base_scan_id TEXT NOT NULL,
    language TEXT NOT NULL,
    file TEXT NOT NULL,
    symbol TEXT NOT NULL,
    change_kind TEXT NOT NULL,
    details JSONB,
    detected_at TIMESTAMPTZ NOT NULL
);

6.2 Valkey Caching

stella:callgraph:{scan_id}:{lang}:{digest}     → Compressed CallGraphSnapshot
stella:callgraph:{scan_id}:{lang}:reachable    → Set of reachable sink IDs
stella:callgraph:{scan_id}:{lang}:paths:{sink} → Shortest path to sink

TTL: Configurable (default 24h)
Circuit breaker: 5 failures → 30s timeout
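
The key layout and the circuit-breaker setting can be illustrated as follows. This sketch assumes that "5 failures → 30s timeout" means five consecutive cache failures cause the cache to be skipped for 30 seconds; the class, its method names, and the injectable clock are hypothetical and not taken from the Scanner code base.

```python
import time


def reachable_key(scan_id, lang):
    """Build the cache key for the set of reachable sink IDs."""
    return f"stella:callgraph:{scan_id}:{lang}:reachable"


class CircuitBreaker:
    """Skip the cache for a while after repeated failures (sketch)."""

    def __init__(self, failure_threshold=5, open_seconds=30.0,
                 clock=time.monotonic):
        self._threshold = failure_threshold
        self._open_seconds = open_seconds
        self._clock = clock          # injectable so tests can fake time
        self._failures = 0
        self._opened_at = None       # None means the breaker is closed

    def allow(self):
        """Return True if the cache may be called right now."""
        if self._opened_at is None:
            return True
        if self._clock() - self._opened_at >= self._open_seconds:
            # Timeout elapsed: close the breaker and start counting again.
            self._opened_at = None
            self._failures = 0
            return True
        return False

    def record_success(self):
        self._failures = 0

    def record_failure(self):
        self._failures += 1
        if self._failures >= self._threshold:
            self._opened_at = self._clock()


# Drive the breaker with a fake clock instead of sleeping.
now = [0.0]
breaker = CircuitBreaker(clock=lambda: now[0])
for _ in range(5):
    breaker.record_failure()
blocked = breaker.allow()      # breaker is open, cache is skipped
now[0] = 30.0
recovered = breaker.allow()    # timeout elapsed, cache is tried again
```

While the breaker is open, a caller would recompute reachability directly instead of waiting on a failing cache. This version resets fully after the timeout; a stricter design would allow a single trial call before closing.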


7. API Endpoints

| Method | Path | Description |
|---|---|---|
| GET | /scans/{scanId}/drift | Get drift results for a scan |
| GET | /drift/{driftId}/sinks | List drifted sinks (paginated) |
| POST | /scans/{scanId}/compute-reachability | Trigger reachability computation |
| GET | /scans/{scanId}/reachability/components | List components with reachability |
| GET | /scans/{scanId}/reachability/findings | Get reachable vulnerable sinks |
| GET | /scans/{scanId}/reachability/explain | Explain why a sink is reachable |

See: docs/api/scanner-drift-api.md


8. Integration Points

8.1 Policy Module

Drift results feed into policy gates for CI/CD blocking:

smart_diff:
  gates:
    - condition: "delta_reachable > 0 AND is_kev = true"
      action: block
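
To make the gate's semantics concrete, the condition above can be hand-translated into a small function. This sketch does not parse the policy expression language. It assumes that `delta_reachable` and `is_kev` are evaluated per finding, that `is_kev` flags a known exploited vulnerability, and that the default outcome is `allow`. All of these are assumptions made for illustration.

```python
def evaluate_gate(findings):
    """Return "block" if any finding meets the gate condition, else "allow".

    findings: iterable of dicts with "delta_reachable" (int) and
    "is_kev" (bool) keys; both field shapes are assumed.
    """
    for finding in findings:
        # Mirrors: delta_reachable > 0 AND is_kev = true
        if finding["delta_reachable"] > 0 and finding["is_kev"]:
            return "block"
    return "allow"


decision = evaluate_gate([
    {"delta_reachable": 0, "is_kev": True},   # KEV, but no new path
    {"delta_reachable": 2, "is_kev": True},   # KEV with new paths
])
```

The first finding alone would pass the gate; the second one triggers the block.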

8.2 VEX Emission

Automatic VEX candidate generation on drift:

| Drift Direction | VEX Status | Justification |
|---|---|---|
| became_unreachable | not_affected | vulnerable_code_not_in_execute_path |
| became_reachable | (none) | Requires manual review |
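
The mapping above is small enough to express directly. In this sketch the function name, the shape of the returned candidate, and the placeholder vulnerability ID are hypothetical; only the status and justification values come from the table.

```python
def vex_candidate(direction, vuln_id):
    """Return a VEX candidate for a drifted sink, or None for manual review."""
    if direction == "became_unreachable":
        return {
            "vulnerability": vuln_id,
            "status": "not_affected",
            "justification": "vulnerable_code_not_in_execute_path",
        }
    if direction == "became_reachable":
        # No automatic statement: a person has to review the new path.
        return None
    raise ValueError(f"unknown drift direction: {direction!r}")


candidate = vex_candidate("became_unreachable", "CVE-0000-0000")  # placeholder ID
needs_review = vex_candidate("became_reachable", "CVE-0000-0000")
```

Returning `None` for `became_reachable` keeps risk-increasing drift out of automated VEX output, so it reaches a reviewer instead.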

8.3 Attestation

DSSE-signed drift attestations:

{
  "_type": "https://in-toto.io/Statement/v1",
  "predicateType": "stellaops.dev/predicates/reachability-drift@v1",
  "predicate": {
    "baseScanId": "abc123",
    "headScanId": "def456",
    "newlyReachable": [...],
    "newlyUnreachable": [...],
    "resultDigest": "sha256:..."
  }
}
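
For orientation, the sketch below builds a statement shaped like the one above and computes the DSSE pre-authentication encoding (PAE), which is the byte string that actually gets signed. The predicate values are placeholders, JSON serialization with sorted keys is an assumption, and key handling and signing are omitted. A complete in-toto statement also carries a `subject` array, which the excerpt above leaves out.

```python
import json


def pae(payload_type, payload):
    """DSSE pre-authentication encoding of a typed payload.

    Format: "DSSEv1" SP len(type) SP type SP len(body) SP body
    """
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (
        len(type_bytes), type_bytes, len(payload), payload
    )


statement = {
    "_type": "https://in-toto.io/Statement/v1",
    "predicateType": "stellaops.dev/predicates/reachability-drift@v1",
    "predicate": {
        "baseScanId": "abc123",
        "headScanId": "def456",
        "newlyReachable": [],                    # placeholder
        "newlyUnreachable": [],                  # placeholder
        "resultDigest": "sha256:<placeholder>",  # not a real digest
    },
}

# Serialize deterministically (an assumption), then wrap for signing.
payload = json.dumps(statement, sort_keys=True, separators=(",", ":")).encode("utf-8")
to_sign = pae("application/vnd.in-toto+json", payload)
```

Signing `to_sign` rather than the raw payload binds the payload type into the signature, so a verifier cannot be tricked into interpreting the same bytes as a different document type.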

9. Performance Characteristics

| Metric | Target | Notes |
|---|---|---|
| Graph extraction (100K LOC) | < 60s | Per language |
| Reachability analysis | < 5s | BFS traversal |
| Drift detection | < 10s | Graph comparison |
| Memory usage | < 2GB | Large projects |
| Cache hit improvement | 10x | Valkey lookup vs recompute |

10. References

  • Implementation Sprints:
    • docs/implplan/SPRINT_3600_0002_0001_call_graph_infrastructure.md
    • docs/implplan/SPRINT_3600_0003_0001_drift_detection_engine.md
  • API Reference: docs/api/scanner-drift-api.md
  • Operations Guide: docs/operations/reachability-drift-guide.md
  • Original Advisory: docs/product-advisories/archived/17-Dec-2025 - Reachability Drift Detection.md
  • Source Code: src/Scanner/__Libraries/StellaOps.Scanner.ReachabilityDrift/