SPRINT_3600_0001_0001 - Reachability Drift Detection Master Plan

2025-12-18 00:02:31 +02:00
parent 8bbfe4d2d2
commit dee252940b
13 changed files with 6099 additions and 1651 deletions

@@ -0,0 +1,395 @@
# Reachability Drift Detection
**Date**: 2025-12-17

**Status**: ANALYZED - Ready for Implementation Planning

**Related Advisories**:
- 14-Dec-2025 - Smart-Diff Technical Reference
- 14-Dec-2025 - Reachability Analysis Technical Reference

---
## 1. EXECUTIVE SUMMARY
This advisory proposes extending StellaOps' Smart-Diff capabilities to detect **reachability drift**: changes, between container image versions, in whether vulnerable code paths are reachable from application entry points.

**Core Insight**: Raw diffs don't equal risk. Most changed lines don't matter for exploitability. Reachability drift detection fuses **call-stack reachability graphs** with **Smart-Diff metadata** to flag only paths that went from **unreachable to reachable** (or vice versa), tied to **SBOM components** and **VEX statements**.

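
The core insight can be sketched as a set difference over reachable sinks. This is a minimal illustration, not the StellaOps implementation: the `ReachabilityDrift` class, its method names, and the adjacency-dictionary graph shape are all assumptions made for the example. It computes the nodes reachable from the entrypoints in each scan, then reports sinks that flipped in either direction.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: graphs are plain adjacency lists keyed by node id.
public static class ReachabilityDrift
{
    // Breadth-first search from all entrypoints; returns every reachable node id.
    public static HashSet<string> Reachable(
        IEnumerable<string> entrypoints,
        IReadOnlyDictionary<string, string[]> edges)
    {
        var seen = new HashSet<string>(entrypoints);
        var queue = new Queue<string>(seen);
        while (queue.Count > 0)
        {
            var current = queue.Dequeue();
            if (!edges.TryGetValue(current, out var callees)) continue;
            foreach (var callee in callees)
                if (seen.Add(callee)) queue.Enqueue(callee);
        }
        return seen;
    }

    // Sinks reachable only in head are new risk; sinks reachable only in base are mitigated.
    public static (List<string> NewlyReachable, List<string> Mitigated) Diff(
        HashSet<string> baseReachable,
        HashSet<string> headReachable,
        IEnumerable<string> sinks)
    {
        var sinkList = sinks.ToList();
        var newly = sinkList
            .Where(s => headReachable.Contains(s) && !baseReachable.Contains(s))
            .ToList();
        var mitigated = sinkList
            .Where(s => baseReachable.Contains(s) && !headReachable.Contains(s))
            .ToList();
        return (newly, mitigated);
    }
}
```
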
---
## 2. GAP ANALYSIS vs EXISTING INFRASTRUCTURE
### 2.1 What Already Exists (Leverage Points)
| Component | Location | Status |
|-----------|----------|--------|
| `MaterialRiskChangeDetector` | `Scanner.SmartDiff.Detection` | DONE - R1-R4 rules |
| `VexCandidateEmitter` | `Scanner.SmartDiff.Detection` | DONE - Absent API detection |
| `ReachabilityGateBridge` | `Scanner.SmartDiff.Detection` | DONE - Lattice to 3-bit |
| `ReachabilitySignal` | `Signals.Contracts` | DONE - Call path model |
| `ReachabilityLatticeState` | `Signals.Contracts.Evidence` | DONE - 5-state enum |
| `CallPath`, `CallPathNode` | `Signals.Contracts.Evidence` | DONE - Path representation |
| `ReachabilityEvidenceChain` | `Signals.Contracts.Evidence` | DONE - Proof chain |
| `vex.graph_nodes/edges` | DB Schema | DONE - Graph storage |
| `scanner.risk_state_snapshots` | DB Schema | DONE - State storage |
| `scanner.material_risk_changes` | DB Schema | DONE - Change storage |
| `FnDriftCalculator` | `Scanner.Core.Drift` | DONE - Classification drift |
| `SarifOutputGenerator` | `Scanner.SmartDiff.Output` | DONE - CI output |
| Reachability Benchmark | `bench/reachability-benchmark/` | DONE - Ground truth cases |
| Language Analyzers | `Scanner.Analyzers.Lang.*` | PARTIAL - Package detection, limited call graph |
### 2.2 What's Missing (New Implementation Required)
| Component | Advisory Ref | Gap Description |
|-----------|-------------|-----------------|
| **Call Graph Extractor (.NET)** | §7 C# Roslyn | No MSBuildWorkspace/Roslyn analysis exists |
| **Call Graph Extractor (Go)** | §7 Go SSA | No golang.org/x/tools/go/ssa integration |
| **Call Graph Extractor (Java)** | §7 | No Soot/WALA integration |
| **Call Graph Extractor (Node)** | §7 | No @babel/traverse integration |
| **`scanner.code_changes` table** | §4 Smart-Diff | AST-level diff facts not persisted |
| **Drift Cause Explainer** | §6 Timeline | No causal attribution on path nodes |
| **Path Viewer UI** | §UX | No Angular component for call path visualization |
| **Cross-scan Function-level Drift** | §6 | State drift exists, function-level doesn't |
| **Entrypoint Discovery (per-framework)** | §3 | Limited beyond package.json/manifest parsing |
### 2.3 Terminology Mapping
| Advisory Term | StellaOps Equivalent | Notes |
|--------------|---------------------|-------|
| `commit_sha` | `scan_id` | StellaOps is image-centric, not commit-centric |
| `call_node` | `vex.graph_nodes` | Existing schema; extend, don't duplicate |
| `call_edge` | `vex.graph_edges` | Existing schema |
| `reachability_drift` | `scanner.material_risk_changes` | Add `cause`, `path_nodes`, `base_scan_id` columns |
| Risk Drift | Material Risk Change | Existing term is more precise |
| Router, Signals | Signals module only | Router module is not implemented |
---
## 3. RECOMMENDED IMPLEMENTATION PATH
### 3.1 What to Ship (Delta from Current State)
```
NEW TABLES:
├── scanner.code_changes # AST-level diff facts
└── scanner.call_graph_snapshots # Per-scan call graph cache
NEW COLUMNS:
├── scanner.material_risk_changes.cause # TEXT - "guard_removed", "new_route", etc.
├── scanner.material_risk_changes.path_nodes # JSONB - Compressed path representation
└── scanner.material_risk_changes.base_scan_id # TEXT - For cross-scan comparison (same type as scan_id)
NEW SERVICES:
├── CallGraphExtractor.DotNet # Roslyn-based for .NET projects
├── CallGraphExtractor.Node # AST-based for Node.js
├── DriftCauseExplainer # Attribute causes to code changes
└── PathCompressor # Compress paths for storage/UI
NEW UI:
└── PathViewerComponent # Angular component for call path visualization
```
### 3.2 What NOT to Ship (Avoid Duplication)
- **Don't create `call_node`/`call_edge` tables** - Use existing `vex.graph_nodes`/`vex.graph_edges`
- **Don't add `commit_sha` columns** - Use `scan_id` consistently
- **Don't build React components** - Angular v17 is the stack
### 3.3 Use Valkey for Graph Caching
Valkey is already integrated in `Router.Gateway.RateLimit`. Use it for:
- **Call graph snapshot caching** - Fast cross-instance lookups
- **Reachability result caching** - Avoid recomputation
- **Key pattern**: `stella:callgraph:{scan_id}:{lang}:{digest}`
```yaml
# Configuration pattern (align with existing Router rate limiting)
reachability:
  valkey_connection: "localhost:6379"
  valkey_bucket: "stella-reachability"
  cache_ttl_hours: 24
  circuit_breaker:
    failure_threshold: 5
    timeout_seconds: 30
```
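
The key pattern above could be produced by a small helper like the following. This is a sketch under stated assumptions: `CallGraphCacheKey` and its members are hypothetical names, lower-casing the language segment is an assumption (to avoid `DotNet` and `dotnet` producing different keys), and no Valkey client call is shown. Because a digest such as `sha256:...` contains a colon, the digest is kept as the last segment so the key stays unambiguous to read.

```csharp
using System;

// Hypothetical helper for the stella:callgraph:{scan_id}:{lang}:{digest} pattern.
public static class CallGraphCacheKey
{
    public static string Build(string scanId, string language, string graphDigest)
    {
        foreach (var part in new[] { scanId, language, graphDigest })
        {
            if (string.IsNullOrWhiteSpace(part))
                throw new ArgumentException("Key segments must be non-empty.");
        }

        // Assumption: language is normalized to lower case before it enters the key.
        return $"stella:callgraph:{scanId}:{language.ToLowerInvariant()}:{graphDigest}";
    }

    // Converts the cache_ttl_hours setting into the expiry to pass to the cache client.
    public static TimeSpan Ttl(int cacheTtlHours) => TimeSpan.FromHours(cacheTtlHours);
}
```
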
---
## 4. TECHNICAL DESIGN
### 4.1 Call Graph Extraction Model
```csharp
/// <summary>
/// Per-scan call graph snapshot for drift comparison.
/// </summary>
public sealed record CallGraphSnapshot
{
    public required string ScanId { get; init; }
    public required string GraphDigest { get; init; }  // Content hash
    public required string Language { get; init; }
    public required DateTimeOffset ExtractedAt { get; init; }
    public required ImmutableArray<CallGraphNode> Nodes { get; init; }
    public required ImmutableArray<CallGraphEdge> Edges { get; init; }
    public required ImmutableArray<string> EntrypointIds { get; init; }
}

public sealed record CallGraphNode
{
    public required string NodeId { get; init; }      // Stable identifier
    public required string Symbol { get; init; }      // Fully qualified name
    public required string File { get; init; }
    public required int Line { get; init; }
    public required string Package { get; init; }
    public required string Visibility { get; init; }  // public/internal/private
    public required bool IsEntrypoint { get; init; }
    public required bool IsSink { get; init; }
    public string? SinkCategory { get; init; }        // CMD_EXEC, SQL_RAW, etc.
}

public sealed record CallGraphEdge
{
    public required string SourceId { get; init; }
    public required string TargetId { get; init; }
    public required string CallKind { get; init; }    // direct/virtual/delegate
}
```
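
`GraphDigest` is described only as a content hash, so how it is computed is open. One possible approach, shown here as a sketch, is to hash a canonical, sorted serialization of node ids and edges so that extraction order does not change the digest. The `GraphDigestCalculator` name, the line-based serialization, and the `sha256:` prefix are assumptions for illustration; the example uses `SHA256.HashData` and `Convert.ToHexString`, which require .NET 5 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical digest helper: order-independent content hash over nodes and edges.
public static class GraphDigestCalculator
{
    public static string Compute(
        IEnumerable<string> nodeIds,
        IEnumerable<(string SourceId, string TargetId, string CallKind)> edges)
    {
        var sb = new StringBuilder();

        // Sort with ordinal comparison so the output is culture-independent.
        foreach (var id in nodeIds.OrderBy(n => n, StringComparer.Ordinal))
            sb.Append("N\t").Append(id).Append('\n');

        foreach (var e in edges
            .OrderBy(e => e.SourceId, StringComparer.Ordinal)
            .ThenBy(e => e.TargetId, StringComparer.Ordinal)
            .ThenBy(e => e.CallKind, StringComparer.Ordinal))
        {
            sb.Append("E\t").Append(e.SourceId)
              .Append('\t').Append(e.TargetId)
              .Append('\t').Append(e.CallKind).Append('\n');
        }

        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(sb.ToString()));
        return "sha256:" + Convert.ToHexString(hash).ToLowerInvariant();
    }
}
```

With a digest of this kind, two scans whose graphs are identical produce the same `graph_digest`, which lets drift comparison skip unchanged graphs.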
### 4.2 Code Change Facts Model
```csharp
/// <summary>
/// AST-level code change facts from Smart-Diff.
/// </summary>
public sealed record CodeChangeFact
{
    public required string ScanId { get; init; }
    public required string File { get; init; }
    public required string Symbol { get; init; }
    public required CodeChangeKind Kind { get; init; }
    public required JsonDocument Details { get; init; }
}

public enum CodeChangeKind
{
    Added,
    Removed,
    SignatureChanged,
    GuardChanged,       // Boolean condition around call modified
    DependencyChanged,  // Callee package/version changed
    VisibilityChanged   // public<->internal<->private
}
```
### 4.3 Drift Cause Attribution
```csharp
/// <summary>
/// Explains why a reachability flip occurred.
/// </summary>
public sealed class DriftCauseExplainer
{
    public DriftCause Explain(
        CallGraphSnapshot baseGraph,
        CallGraphSnapshot headGraph,
        string sinkSymbol,
        IReadOnlyList<CodeChangeFact> codeChanges)
    {
        // baseGraph is not consulted here; attribution relies on codeChanges alone.
        // Find the shortest entrypoint-to-sink path in the head graph.
        var path = ShortestPath(headGraph.EntrypointIds, sinkSymbol, headGraph);
        if (path is null)
            return DriftCause.Unknown;

        // Index nodes by id once instead of scanning the node list for every path step.
        var nodesById = headGraph.Nodes.ToDictionary(n => n.NodeId);

        // Walk the path from the entrypoint and report the first node with a code change.
        foreach (var nodeId in path.NodeIds)
        {
            if (!nodesById.TryGetValue(nodeId, out var node))
                continue;

            var change = codeChanges.FirstOrDefault(c => c.Symbol == node.Symbol);
            if (change is null)
                continue;

            return change.Kind switch
            {
                CodeChangeKind.GuardChanged => DriftCause.GuardRemoved(node.Symbol, node.File, node.Line),
                CodeChangeKind.Added => DriftCause.NewPublicRoute(node.Symbol),
                CodeChangeKind.VisibilityChanged => DriftCause.VisibilityEscalated(node.Symbol),
                CodeChangeKind.DependencyChanged => DriftCause.DepUpgraded(change.Details),
                _ => DriftCause.CodeModified(node.Symbol)
            };
        }

        // No changed symbol lies on the path.
        return DriftCause.Unknown;
    }
}
```
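
The `ShortestPath` helper used above is not defined in this advisory. A minimal sketch of one way to write it is a multi-source breadth-first search with parent pointers. Everything here is an assumption for illustration: the `PathFinder` class, the adjacency-dictionary input (rather than a `CallGraphSnapshot`), and the choice to address the sink by node id rather than by symbol.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical BFS helper; returns node ids from an entrypoint to the sink.
public static class PathFinder
{
    public static IReadOnlyList<string>? ShortestPath(
        IEnumerable<string> entrypointIds,
        string sinkId,
        IReadOnlyDictionary<string, IReadOnlyList<string>> adjacency)
    {
        // parent[n] is the node we reached n from; entrypoints have a null parent.
        var parent = new Dictionary<string, string?>();
        var queue = new Queue<string>();
        foreach (var entry in entrypointIds)
            if (parent.TryAdd(entry, null)) queue.Enqueue(entry);

        while (queue.Count > 0)
        {
            var current = queue.Dequeue();
            if (current == sinkId)
            {
                // Follow parent pointers back to the entrypoint, then reverse.
                var path = new List<string>();
                for (string? n = current; n is not null; n = parent[n])
                    path.Add(n);
                path.Reverse();
                return path;
            }

            if (!adjacency.TryGetValue(current, out var callees)) continue;
            foreach (var callee in callees)
                if (parent.TryAdd(callee, current)) queue.Enqueue(callee);
        }

        return null; // Sink is unreachable from every entrypoint.
    }
}
```

Because breadth-first search visits nodes in order of distance, the first time the sink is dequeued the recovered path has the fewest edges, which keeps the path shown to reviewers short.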
### 4.4 Database Schema Extensions
```sql
-- New table: Code change facts from AST-level Smart-Diff
CREATE TABLE scanner.code_changes (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id UUID NOT NULL,
    scan_id TEXT NOT NULL,
    file TEXT NOT NULL,
    symbol TEXT NOT NULL,
    change_kind TEXT NOT NULL,  -- added|removed|signature|guard|dep|visibility
    details JSONB,
    detected_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    CONSTRAINT code_changes_unique UNIQUE (tenant_id, scan_id, file, symbol)
);

CREATE INDEX idx_code_changes_scan ON scanner.code_changes(scan_id);
CREATE INDEX idx_code_changes_symbol ON scanner.code_changes(symbol);

-- New table: Per-scan call graph snapshots (compressed)
CREATE TABLE scanner.call_graph_snapshots (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id UUID NOT NULL,
    scan_id TEXT NOT NULL,
    language TEXT NOT NULL,
    graph_digest TEXT NOT NULL,  -- Content hash for dedup
    node_count INT NOT NULL,
    edge_count INT NOT NULL,
    entrypoint_count INT NOT NULL,
    extracted_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    cas_uri TEXT NOT NULL,  -- Reference to CAS for full graph
    CONSTRAINT call_graph_snapshots_unique UNIQUE (tenant_id, scan_id, language)
);

CREATE INDEX idx_call_graph_snapshots_digest ON scanner.call_graph_snapshots(graph_digest);

-- Extend existing material_risk_changes table
ALTER TABLE scanner.material_risk_changes
    ADD COLUMN IF NOT EXISTS cause TEXT,
    ADD COLUMN IF NOT EXISTS path_nodes JSONB,
    ADD COLUMN IF NOT EXISTS base_scan_id TEXT;

CREATE INDEX IF NOT EXISTS idx_material_risk_changes_cause
    ON scanner.material_risk_changes(cause) WHERE cause IS NOT NULL;
```
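
The `change_kind` column stores short text values, while the C# model in §4.2 uses the `CodeChangeKind` enum. A mapping between the two might look like the sketch below. The `CodeChangeKindMapping` class is a hypothetical name, the pairing of each enum member with a text value is inferred from the column comment, and the enum is redeclared only so the example is self-contained.

```csharp
using System;

// Redeclared from §4.2 so the sketch compiles on its own.
public enum CodeChangeKind
{
    Added,
    Removed,
    SignatureChanged,
    GuardChanged,
    DependencyChanged,
    VisibilityChanged
}

// Hypothetical mapping between the enum and the change_kind column values.
public static class CodeChangeKindMapping
{
    public static string ToDbValue(CodeChangeKind kind) => kind switch
    {
        CodeChangeKind.Added => "added",
        CodeChangeKind.Removed => "removed",
        CodeChangeKind.SignatureChanged => "signature",
        CodeChangeKind.GuardChanged => "guard",
        CodeChangeKind.DependencyChanged => "dep",
        CodeChangeKind.VisibilityChanged => "visibility",
        _ => throw new ArgumentOutOfRangeException(nameof(kind), kind, "Unknown change kind")
    };

    public static CodeChangeKind FromDbValue(string value) => value switch
    {
        "added" => CodeChangeKind.Added,
        "removed" => CodeChangeKind.Removed,
        "signature" => CodeChangeKind.SignatureChanged,
        "guard" => CodeChangeKind.GuardChanged,
        "dep" => CodeChangeKind.DependencyChanged,
        "visibility" => CodeChangeKind.VisibilityChanged,
        _ => throw new ArgumentException($"Unknown change_kind value: {value}", nameof(value))
    };
}
```
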
---
## 5. UI DESIGN
### 5.1 Risk Drift Card (PR/Commit View)
```
┌─────────────────────────────────────────────────────────────────────┐
│ RISK DRIFT ▼ │
├─────────────────────────────────────────────────────────────────────┤
│ +3 new reachable paths -2 mitigated paths │
│ │
│ ┌─ NEW REACHABLE ──────────────────────────────────────────────┐ │
│ │ POST /payments → PaymentsController.Capture → ... → │ │
│ │ crypto.Verify(legacy) │ │
│ │ │ │
│ │ [pkg:payments@1.8.2] [CVE-2024-1234] [EPSS 0.72] [VEX:affected]│ │
│ │ │ │
│ │ Cause: guard removed in AuthFilter.cs:42 │ │
│ │ │ │
│ │ [View Path] [Quarantine Route] [Pin Version] [Add Exception] │ │
│ └───────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─ MITIGATED ──────────────────────────────────────────────────┐ │
│ │ GET /admin → AdminController.Execute → ... → cmd.Run │ │
│ │ │ │
│ │ [pkg:admin@2.0.0] [CVE-2024-5678] [VEX:not_affected] │ │
│ │ │ │
│ │ Reason: Vulnerable API removed in upgrade │ │
│ └───────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
```
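
The card shows each path in elided form (entrypoint, first call, `...`, sink). A rendering helper for that form could be sketched as follows. This is an illustration only: `PathRenderer` and its `keepHead`/`keepTail` parameters are hypothetical, and the advisory does not specify how `PathCompressor` actually selects which nodes to keep.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical renderer: keeps the first and last nodes and elides the middle.
public static class PathRenderer
{
    public static string Render(
        IReadOnlyList<string> labels,
        int keepHead = 2,
        int keepTail = 1)
    {
        // Short paths are shown in full.
        if (labels.Count <= keepHead + keepTail)
            return string.Join(" → ", labels);

        var parts = labels.Take(keepHead)
            .Append("...")
            .Concat(labels.Skip(labels.Count - keepTail));
        return string.Join(" → ", parts);
    }
}
```

A production compressor would probably also keep any node flagged as changed, since that node carries the cause shown on the card; the sketch omits this to stay short.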
### 5.2 Path Viewer Component
```
┌─────────────────────────────────────────────────────────────────────┐
│ CALL PATH: POST /payments → crypto.Verify(legacy) [Collapse] │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ ○ POST /payments [ENTRYPOINT] │
│ │ PaymentsController.cs:45 │
│ │ │
│ ├──○ PaymentsController.Capture() │
│ │ │ PaymentsController.cs:89 │
│ │ │ │
│ │ ├──○ PaymentService.ProcessPayment() │
│ │ │ │ PaymentService.cs:156 │
│ │ │ │ │
│ │ │ ├──● CryptoHelper.Verify() ← GUARD REMOVED │
│ │ │ │ │ CryptoHelper.cs:42 [Changed: AuthFilter removed] │
│ │ │ │ │ │
│ │ │ │ └──◆ crypto.Verify(legacy) [VULNERABLE SINK] │
│ │ │ │ pkg:crypto@1.2.3 │
│ │ │ │ CVE-2024-1234 (CVSS 9.8) │
│ │
│ Legend: ○ Node ● Changed ◆ Sink ─ Call │
└─────────────────────────────────────────────────────────────────────┘
```
---
## 6. POLICY INTEGRATION
### 6.1 CI Gate Behavior
```yaml
# Policy wiring for drift detection
smart_diff:
  gates:
    # Fail PR when new reachable paths to affected sinks
    - condition: "delta_reachable > 0 AND vex_status IN ['affected', 'under_investigation']"
      action: block
      message: "New reachable paths to vulnerable sinks detected"

    # Warn when new paths to any sink
    - condition: "delta_reachable > 0"
      action: warn
      message: "New reachable paths detected - review recommended"

    # Auto-mitigate when VEX confirms not_affected
    - condition: "vex_status == 'not_affected' AND vex_justification IN ['component_not_present', 'fix_applied']"
      action: allow
      auto_mitigate: true
```
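
The gate list above does not state how overlapping conditions are resolved. The sketch below assumes that the most severe matching action wins (block over warn over allow) and that a finding matching no gate is allowed. Both rules are assumptions, as are the `DriftGates`, `DriftFinding`, and `GateAction` names. The three conditions are hard-coded as predicates rather than parsed from the YAML expressions.

```csharp
using System;
using System.Linq;

// Ordered by severity so that Max() picks the strictest action.
public enum GateAction { Allow = 0, Warn = 1, Block = 2 }

// Hypothetical input: the fields referenced by the gate conditions.
public sealed record DriftFinding(int DeltaReachable, string VexStatus, string? VexJustification);

public static class DriftGates
{
    private static readonly (Func<DriftFinding, bool> Condition, GateAction Action)[] Gates =
    {
        (f => f.DeltaReachable > 0
              && (f.VexStatus is "affected" or "under_investigation"), GateAction.Block),
        (f => f.DeltaReachable > 0, GateAction.Warn),
        (f => f.VexStatus == "not_affected"
              && (f.VexJustification is "component_not_present" or "fix_applied"), GateAction.Allow),
    };

    // Assumption: the most severe matching action wins; no match means Allow.
    public static GateAction Evaluate(DriftFinding finding) =>
        Gates.Where(g => g.Condition(finding))
             .Select(g => g.Action)
             .DefaultIfEmpty(GateAction.Allow)
             .Max();
}
```

Under this reading, a finding with new reachable paths and a `not_affected` VEX status still produces a warning, because the second gate matches regardless of VEX status. If the intent is for the allow gate to suppress that warning, the gate conditions or the resolution rule would need to say so explicitly.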
### 6.2 Exit Codes
| Code | Meaning |
|------|---------|
| 0 | Success, no material drift |
| 1 | Success, material drift found (info) |
| 2 | Success, hardening regression detected |
| 3 | Success, new KEV reachable |
| 10+ | Errors |
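
A CLI wrapper could derive the process exit code from the scan outcome along these lines. The table does not say which code applies when several conditions hold at once, so this sketch assumes the highest applicable code wins. That precedence rule and the `DriftExitCode` and `ExitCodes` names are assumptions for illustration.

```csharp
using System;

// Mirrors the success codes in the table above; 10+ remains reserved for errors.
public enum DriftExitCode
{
    NoMaterialDrift = 0,
    MaterialDrift = 1,
    HardeningRegression = 2,
    NewKevReachable = 3
}

public static class ExitCodes
{
    // Assumption: when several conditions hold, the highest code is reported.
    public static DriftExitCode From(
        bool materialDrift,
        bool hardeningRegression,
        bool newKevReachable)
    {
        if (newKevReachable) return DriftExitCode.NewKevReachable;
        if (hardeningRegression) return DriftExitCode.HardeningRegression;
        if (materialDrift) return DriftExitCode.MaterialDrift;
        return DriftExitCode.NoMaterialDrift;
    }
}
```
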
---
## 7. SPRINT STRUCTURE
### 7.1 Master Sprint: SPRINT_3600_0001_0001
**Topic**: Reachability Drift Detection

**Dependencies**: SPRINT_3500 (Smart-Diff) - COMPLETE
### 7.2 Sub-Sprints
| ID | Topic | Priority | Effort | Dependencies |
|----|-------|----------|--------|--------------|
| SPRINT_3600_0002_0001 | Call Graph Infrastructure | P0 | Large | Master |
| SPRINT_3600_0003_0001 | Drift Detection Engine | P0 | Medium | 3600.2 |
| SPRINT_3600_0004_0001 | UI and Evidence Chain | P1 | Medium | 3600.3 |
---
## 8. REFERENCES
- `docs/product-advisories/14-Dec-2025 - Smart-Diff Technical Reference.md`
- `docs/product-advisories/14-Dec-2025 - Reachability Analysis Technical Reference.md`
- `docs/implplan/SPRINT_3500_0001_0001_smart_diff_master.md`
- `docs/reachability/lattice.md`
- `bench/reachability-benchmark/README.md`