# Reachability Drift Alert Flow

## Overview

The Reachability Drift Alert Flow describes how StellaOps detects and alerts on changes in code reachability that affect vulnerability risk assessments. When runtime observations or static analysis reveal that previously unreachable vulnerable code has become reachable (or vice versa), this flow triggers re-evaluation and notifications.

**Business Value**: Catch newly reachable vulnerabilities before they're exploited, and reduce alert fatigue by downgrading unreachable vulnerabilities automatically.

## Actors

| Actor | Type | Role |
|-------|------|------|
| Signals | Service | Collects runtime telemetry |
| ReachGraph | Service | Analyzes reachability state |
| Scanner | Service | Re-evaluates findings |
| Policy Engine | Service | Re-evaluates verdicts |
| Notify | Service | Sends drift alerts |
| Scheduler | Service | Orchestrates periodic checks |

## Prerequisites

- Runtime instrumentation deployed (eBPF agent or OpenTelemetry)
- Baseline reachability analysis completed
- Drift detection policies configured
- Alert channels configured

## Reachability State Transitions

| From State | To State | Risk Impact | Alert Priority |
|------------|----------|-------------|----------------|
| Unknown | StaticallyReachable | Increased | Medium |
| Unknown | RuntimeObserved | Increased | High |
| StaticallyUnreachable | StaticallyReachable | Increased | Medium |
| StaticallyReachable | RuntimeObserved | Confirmed | High |
| RuntimeObserved | ConfirmedReachable | Confirmed | High |
| StaticallyReachable | StaticallyUnreachable | Decreased | Low |
| RuntimeObserved | RuntimeUnobserved | Decreased | Medium |
| Any | ConfirmedUnreachable | Decreased | Low |
| Any | Contested | Review needed | High |

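The transition table can be read as a simple lookup from a `(from, to)` pair to a risk impact and alert priority. The following is a minimal sketch of that lookup, not the ReachGraph implementation: the state names come from the table, while `TransitionRule`, `TRANSITION_RULES`, and `classifyTransition` are hypothetical names, and treating unlisted pairs as "no drift rule" is an assumption.

```typescript
// Sketch only: state names come from the table above; the rule list,
// the "Any" wildcard handling, and `classifyTransition` are assumptions.
type K4State =
  | "Unknown"
  | "StaticallyReachable"
  | "StaticallyUnreachable"
  | "RuntimeObserved"
  | "RuntimeUnobserved"
  | "ConfirmedReachable"
  | "ConfirmedUnreachable"
  | "Contested";

interface TransitionRule {
  from: K4State | "Any";
  to: K4State;
  riskImpact: "Increased" | "Confirmed" | "Decreased" | "Review needed";
  alertPriority: "High" | "Medium" | "Low";
}

// One entry per table row, in table order.
const TRANSITION_RULES: TransitionRule[] = [
  { from: "Unknown", to: "StaticallyReachable", riskImpact: "Increased", alertPriority: "Medium" },
  { from: "Unknown", to: "RuntimeObserved", riskImpact: "Increased", alertPriority: "High" },
  { from: "StaticallyUnreachable", to: "StaticallyReachable", riskImpact: "Increased", alertPriority: "Medium" },
  { from: "StaticallyReachable", to: "RuntimeObserved", riskImpact: "Confirmed", alertPriority: "High" },
  { from: "RuntimeObserved", to: "ConfirmedReachable", riskImpact: "Confirmed", alertPriority: "High" },
  { from: "StaticallyReachable", to: "StaticallyUnreachable", riskImpact: "Decreased", alertPriority: "Low" },
  { from: "RuntimeObserved", to: "RuntimeUnobserved", riskImpact: "Decreased", alertPriority: "Medium" },
  { from: "Any", to: "ConfirmedUnreachable", riskImpact: "Decreased", alertPriority: "Low" },
  { from: "Any", to: "Contested", riskImpact: "Review needed", alertPriority: "High" },
];

// Returns the matching rule, or undefined when the pair is not a listed drift.
function classifyTransition(from: K4State, to: K4State): TransitionRule | undefined {
  if (from === to) return undefined; // no state change, so no drift
  for (const rule of TRANSITION_RULES) {
    if ((rule.from === "Any" || rule.from === from) && rule.to === to) return rule;
  }
  return undefined;
}
```

Keeping the rules as data rather than nested conditionals means a new table row only needs a new array entry.
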
## Flow Diagram

```
┌────────────────────────────────────────────────────────────────────────────┐
│                       Reachability Drift Alert Flow                        │
└────────────────────────────────────────────────────────────────────────────┘

┌─────────┐  ┌───────────┐  ┌───────────┐  ┌─────────┐  ┌────────┐  ┌────────┐
│ Signals │  │ ReachGraph│  │ Scheduler │  │ Scanner │  │ Policy │  │ Notify │
└────┬────┘  └─────┬─────┘  └─────┬─────┘  └────┬────┘  └───┬────┘  └───┬────┘
     │             │              │             │           │           │
     │ Runtime     │              │             │           │           │
     │ event       │              │             │           │           │
     │────────────>│              │             │           │           │
     │             │              │             │           │           │
     │             │ Update       │             │           │           │
     │             │ call graph   │             │           │           │
     │             │───┐          │             │           │           │
     │             │   │          │             │           │           │
     │             │<──┘          │             │           │           │
     │             │              │             │           │           │
     │             │ Detect       │             │           │           │
     │             │ state change │             │           │           │
     │             │───┐          │             │           │           │
     │             │   │          │             │           │           │
     │             │<──┘          │             │           │           │
     │             │              │             │           │           │
     │             │ [If state    │             │           │           │
     │             │ changed]     │             │           │           │
     │             │              │             │           │           │
     │             │ Emit drift   │             │           │           │
     │             │ event        │             │           │           │
     │             │─────────────>│             │           │           │
     │             │              │             │           │           │
     │             │              │ Queue       │           │           │
     │             │              │ re-eval     │           │           │
     │             │              │────────────>│           │           │
     │             │              │             │           │           │
     │             │              │             │ Load scan │           │
     │             │              │             │ + findings│           │
     │             │              │             │───┐       │           │
     │             │              │             │   │       │           │
     │             │              │             │<──┘       │           │
     │             │              │             │           │           │
     │             │              │             │ Update    │           │
     │             │              │             │ reach     │           │
     │             │              │             │ states    │           │
     │             │              │             │───┐       │           │
     │             │              │             │   │       │           │
     │             │              │             │<──┘       │           │
     │             │              │             │           │           │
     │             │              │             │ Re-eval   │           │
     │             │              │             │──────────>│           │
     │             │              │             │           │           │
     │             │              │             │           │ Compare   │
     │             │              │             │           │ verdicts  │
     │             │              │             │           │───┐       │
     │             │              │             │           │   │       │
     │             │              │             │           │<──┘       │
     │             │              │             │           │           │
     │             │              │             │ New       │           │
     │             │              │             │ verdict   │           │
     │             │              │             │<──────────│           │
     │             │              │             │           │           │
     │             │              │ [If verdict │           │           │
     │             │              │ changed]    │           │           │
     │             │              │             │           │           │
     │             │              │ Alert       │           │           │
     │             │              │────────────────────────────────────>│
     │             │              │             │           │           │
     │             │              │             │           │           │ Send
     │             │              │             │           │           │ alert
     │             │              │             │           │           │───┐
     │             │              │             │           │           │   │
     │             │              │             │           │           │<──┘
     │             │              │             │           │           │
```

## Step-by-Step

### 1. Runtime Event Collection

The Signals service collects function invocation data:

```json
{
  "event_type": "function_invocation",
  "timestamp": "2024-12-29T10:30:00Z",
  "source": "ebpf-agent",
  "payload": {
    "container_id": "abc123...",
    "image_digest": "sha256:...",
    "function": "lodash.template",
    "package": "pkg:npm/lodash@4.17.20",
    "call_stack": [
      "app/routes/render.js:45",
      "lib/template-engine.js:123",
      "node_modules/lodash/template.js:89"
    ],
    "invocation_count": 1
  }
}
```

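A consumer of these events would typically validate the shape before updating any state. The sketch below mirrors the fields of the example event in a TypeScript interface and a type guard; `FunctionInvocationEvent`, `isRecord`, and `isFunctionInvocationEvent` are hypothetical names, and the specific checks (parseable timestamp, positive invocation count) are assumptions rather than documented Signals behaviour.

```typescript
// Hypothetical typing of the example event above; field names follow the JSON.
interface FunctionInvocationEvent {
  event_type: "function_invocation";
  timestamp: string; // ISO 8601
  source: string;
  payload: {
    container_id: string;
    image_digest: string;
    function: string;
    package: string; // package URL (purl)
    call_stack: string[];
    invocation_count: number;
  };
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Type guard: true only when every field in the example is present and well-typed.
function isFunctionInvocationEvent(value: unknown): value is FunctionInvocationEvent {
  if (!isRecord(value)) return false;
  const payload = value.payload;
  if (!isRecord(payload)) return false;
  const timestamp = value.timestamp;
  const stack = payload.call_stack;
  const count = payload.invocation_count;
  return (
    value.event_type === "function_invocation" &&
    typeof timestamp === "string" &&
    !isNaN(Date.parse(timestamp)) &&
    typeof value.source === "string" &&
    typeof payload.container_id === "string" &&
    typeof payload.image_digest === "string" &&
    typeof payload.function === "string" &&
    typeof payload.package === "string" &&
    Array.isArray(stack) &&
    stack.every((frame) => typeof frame === "string") &&
    typeof count === "number" &&
    count >= 1 // assumption: an event reports at least one invocation
  );
}
```
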
### 2. Reachability State Update

ReachGraph updates the K4 lattice state:

```json
{
  "state_transition": {
    "package": "pkg:npm/lodash@4.17.20",
    "function": "lodash.template",
    "image_digest": "sha256:...",
    "previous_state": "StaticallyReachable",
    "new_state": "RuntimeObserved",
    "transition_reason": "first_runtime_invocation",
    "evidence": {
      "static": {
        "call_paths": 3,
        "entry_points": ["app/routes/render.js:45"]
      },
      "runtime": {
        "first_observed": "2024-12-29T10:30:00Z",
        "invocation_count": 1,
        "call_stack_hash": "sha256:stackhash..."
      }
    }
  }
}
```

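To make the update rule concrete, here is a minimal sketch of how a runtime observation might move a function from one state to another. It is an illustration under stated assumptions, not ReachGraph's actual logic: `applyRuntimeObservation` is a hypothetical helper, the `Unknown` and `StaticallyReachable` cases follow the transition table, the `StaticallyUnreachable` case assumes that contradicting evidence is treated as a state conflict (marked `Contested`, as in the Error Handling table), and the reason string `runtime_contradicts_static` is invented for the example.

```typescript
type K4State =
  | "Unknown"
  | "StaticallyReachable"
  | "StaticallyUnreachable"
  | "RuntimeObserved"
  | "RuntimeUnobserved"
  | "ConfirmedReachable"
  | "ConfirmedUnreachable"
  | "Contested";

interface StateTransition {
  previous_state: K4State;
  new_state: K4State;
  transition_reason: string;
}

// Returns the transition caused by a first runtime invocation,
// or null when this sketch leaves the state unchanged.
function applyRuntimeObservation(previous: K4State): StateTransition | null {
  switch (previous) {
    case "Unknown":
    case "StaticallyReachable":
      // Both rows appear in the transition table.
      return {
        previous_state: previous,
        new_state: "RuntimeObserved",
        transition_reason: "first_runtime_invocation",
      };
    case "StaticallyUnreachable":
      // Assumption: runtime evidence contradicts static analysis -> conflict.
      return {
        previous_state: previous,
        new_state: "Contested",
        transition_reason: "runtime_contradicts_static", // hypothetical reason code
      };
    default:
      // Remaining states are not modelled here.
      return null;
  }
}
```
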
### 3. Drift Detection

ReachGraph compares the new state against the baseline and detects drift:

```json
{
  "drift_event": {
    "drift_id": "drift-123",
    "detected_at": "2024-12-29T10:30:01Z",
    "image_digest": "sha256:...",
    "package": "pkg:npm/lodash@4.17.20",
    "transition": {
      "from": "StaticallyReachable",
      "to": "RuntimeObserved"
    },
    "affected_vulnerabilities": [
      {
        "cve": "CVE-2024-1234",
        "severity": "critical",
        "previous_risk": "medium",
        "new_risk": "high"
      }
    ],
    "risk_impact": "increased",
    "alert_priority": "high"
  }
}
```

### 4. Re-evaluation Trigger

The Scheduler queues the affected scans for re-evaluation:

```json
{
  "reevaluation_job": {
    "job_id": "reeval-456",
    "trigger": "reachability_drift",
    "drift_id": "drift-123",
    "scans": [
      {
        "scan_id": "scan-abc123",
        "image": "docker.io/myorg/app:v1.2.3",
        "affected_findings": ["CVE-2024-1234"]
      }
    ],
    "priority": "high"
  }
}
```

### 5. Finding Re-evaluation

The Scanner updates findings with the new reachability state:

```json
{
  "finding_update": {
    "scan_id": "scan-abc123",
    "cve": "CVE-2024-1234",
    "package": "pkg:npm/lodash@4.17.20",
    "previous": {
      "reachability": "StaticallyReachable",
      "confidence": 0.70
    },
    "updated": {
      "reachability": "RuntimeObserved",
      "confidence": 0.95,
      "evidence": {
        "runtime_observed_at": "2024-12-29T10:30:00Z",
        "call_stack": ["..."]
      }
    }
  }
}
```

### 6. Policy Re-evaluation

The Policy Engine re-evaluates the scan with the updated reachability:

```json
{
  "verdict_comparison": {
    "scan_id": "scan-abc123",
    "previous_verdict": "WARN",
    "new_verdict": "FAIL",
    "verdict_changed": true,
    "changes": [
      {
        "finding": "CVE-2024-1234",
        "rule": "no-critical-reachable",
        "previous_result": "WARN (StaticallyReachable)",
        "new_result": "FAIL (RuntimeObserved)",
        "reason": "Runtime execution confirmed - elevated from warning to block"
      }
    ]
  }
}
```

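The comparison in this step amounts to evaluating the same rule against the old and new reachability and checking whether the result differs. The sketch below shows one way the `no-critical-reachable` rule from the example could behave; it is a guess at the rule's semantics, not the Policy Engine's definition. The `PASS` verdict, the treatment of `ConfirmedReachable` as `FAIL`, and the names `evaluateNoCriticalReachable` and `compareVerdicts` are all assumptions.

```typescript
type Verdict = "PASS" | "WARN" | "FAIL"; // PASS is assumed; the example only shows WARN and FAIL

interface Finding {
  cve: string;
  severity: string;
  reachability: string;
}

// Hypothetical semantics for the `no-critical-reachable` rule.
function evaluateNoCriticalReachable(finding: Finding): Verdict {
  if (finding.severity !== "critical") return "PASS";
  if (finding.reachability === "RuntimeObserved" || finding.reachability === "ConfirmedReachable") {
    return "FAIL"; // runtime evidence blocks
  }
  if (finding.reachability === "StaticallyReachable") return "WARN";
  return "PASS";
}

interface VerdictComparison {
  scan_id: string;
  previous_verdict: Verdict;
  new_verdict: Verdict;
  verdict_changed: boolean;
}

// Evaluates the rule before and after the reachability update.
function compareVerdicts(scanId: string, previous: Finding, updated: Finding): VerdictComparison {
  const previousVerdict = evaluateNoCriticalReachable(previous);
  const newVerdict = evaluateNoCriticalReachable(updated);
  return {
    scan_id: scanId,
    previous_verdict: previousVerdict,
    new_verdict: newVerdict,
    verdict_changed: previousVerdict !== newVerdict,
  };
}
```

Only a comparison with `verdict_changed: true` would go on to trigger an alert, which matches the `[If verdict changed]` branch in the flow diagram.
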
### 7. Alert Generation

Notify sends the drift alert:

```json
{
  "alert": {
    "alert_id": "alert-789",
    "type": "reachability_drift",
    "priority": "high",
    "title": "Vulnerability Reachability Confirmed by Runtime",
    "body": {
      "summary": "CVE-2024-1234 in lodash@4.17.20 was observed at runtime",
      "image": "docker.io/myorg/app:v1.2.3",
      "cve": "CVE-2024-1234",
      "severity": "critical",
      "transition": "StaticallyReachable → RuntimeObserved",
      "impact": "Verdict changed from WARN to FAIL",
      "action_required": "Immediate remediation recommended",
      "remediation": "Upgrade lodash to 4.17.21"
    },
    "channels": ["slack", "pagerduty", "email"]
  }
}
```

### Slack Alert Format

```
🚨 Vulnerability Reachability Confirmed

CVE: CVE-2024-1234 (Critical)
Package: lodash@4.17.20
Image: myorg/app:v1.2.3

State Change: StaticallyReachable → RuntimeObserved
Impact: Verdict changed WARN → FAIL

The vulnerable function `lodash.template` was invoked at runtime,
confirming that the vulnerable code path is reachable.

Action: Immediate remediation recommended
Fix: Upgrade lodash to 4.17.21

[View Details] [Create Ticket] [Add Exception]
```

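As an illustration of how the alert body from step 7 could be turned into the text layout above, here is a minimal rendering sketch. It produces plain text only; the action buttons, the package line (the alert body has no package field), and any real Slack API call are left out. `DriftAlertBody` mirrors the `body` fields of the alert, and `renderSlackText` and `capitalize` are hypothetical helpers.

```typescript
// Mirrors the `body` object of the alert shown in step 7.
interface DriftAlertBody {
  summary: string;
  image: string;
  cve: string;
  severity: string;
  transition: string;
  impact: string;
  action_required?: string;
  remediation?: string;
}

function capitalize(text: string): string {
  return text.charAt(0).toUpperCase() + text.slice(1);
}

// Builds the plain-text message; optional fields are skipped when absent.
function renderSlackText(title: string, body: DriftAlertBody): string {
  const lines: string[] = [
    "🚨 " + title,
    "",
    "CVE: " + body.cve + " (" + capitalize(body.severity) + ")",
    "Image: " + body.image,
    "",
    "State Change: " + body.transition,
    "Impact: " + body.impact,
    "",
    body.summary,
  ];
  if (body.action_required !== undefined || body.remediation !== undefined) {
    lines.push("");
  }
  if (body.action_required !== undefined) lines.push("Action: " + body.action_required);
  if (body.remediation !== undefined) lines.push("Fix: " + body.remediation);
  return lines.join("\n");
}
```
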
## Drift Detection Modes

### Real-time Detection

```yaml
drift_detection:
  mode: realtime
  config:
    latency_target: 5s
    buffer_window: 0
    immediate_alert_severity: [critical, high]
```

### Batch Detection

```yaml
drift_detection:
  mode: batch
  config:
    check_interval: 15m
    aggregate_similar: true
    min_invocations_for_transition: 3
```

### Hybrid Detection

```yaml
drift_detection:
  mode: hybrid
  config:
    realtime_for: [critical]
    batch_for: [high, medium, low]
    batch_interval: 1h
```

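The three modes differ mainly in which processing path a drift takes. A minimal sketch of that routing decision follows; it assumes the decision is made per vulnerability severity, which the hybrid `realtime_for` / `batch_for` lists suggest but the document does not state outright. `DriftDetectionConfig` and `selectDetectionPath` are hypothetical names, and falling back to batch for severities missing from `realtime_for` is an assumption.

```typescript
type Severity = "critical" | "high" | "medium" | "low";
type DetectionPath = "realtime" | "batch";

// Only the fields needed for routing are modelled here.
type DriftDetectionConfig =
  | { mode: "realtime" }
  | { mode: "batch" }
  | { mode: "hybrid"; realtime_for: Severity[]; batch_for: Severity[] };

// Picks the processing path for a drift affecting a CVE of the given severity.
function selectDetectionPath(config: DriftDetectionConfig, severity: Severity): DetectionPath {
  if (config.mode === "hybrid") {
    // Assumption: anything not listed for real-time handling is batched.
    return config.realtime_for.indexOf(severity) !== -1 ? "realtime" : "batch";
  }
  return config.mode; // "realtime" or "batch" applies to every severity
}
```

With the hybrid configuration shown above, a drift affecting a critical CVE would be handled immediately, while high, medium, and low severities would wait for the next batch interval.
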
## Downgrade Handling

When reachability decreases (e.g., code removed), the flow emits a downgrade event. Because the change is an improvement, the notification uses the lower `info` priority:

```json
{
  "downgrade_event": {
    "drift_id": "drift-456",
    "type": "risk_decrease",
    "package": "pkg:npm/lodash@4.17.20",
    "cve": "CVE-2024-1234",
    "transition": {
      "from": "RuntimeObserved",
      "to": "RuntimeUnobserved",
      "observation_gap": "30d"
    },
    "risk_impact": "decreased",
    "action": "auto_downgrade",
    "verdict_change": "FAIL → WARN",
    "notification": "info"
  }
}
```

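The `observation_gap` in the event implies a check of the form "has this function gone unobserved for long enough?". The following sketch shows one way such a check could look. It is not the documented algorithm: `AutoDowngradeRule` borrows its field names from the `auto_downgrade` policy settings, `parseDays` and `isDowngradeEligible` are hypothetical helpers, only the `Nd` duration form is handled, and the meaning of the `confidence` input is an assumption.

```typescript
interface AutoDowngradeRule {
  enabled: boolean;
  observation_gap: string; // e.g. "30d"; only whole days are supported in this sketch
  confidence_threshold?: number;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Parses durations such as "30d" or "14d" into a number of days.
function parseDays(gap: string): number {
  if (!/^\d+d$/.test(gap)) throw new Error("Unsupported observation_gap: " + gap);
  return parseInt(gap, 10); // parseInt stops at the trailing "d"
}

// True when the function has gone unobserved for at least the configured gap
// and, if a threshold is set, the confidence meets it.
function isDowngradeEligible(
  rule: AutoDowngradeRule,
  lastObservedAt: string,
  now: string,
  confidence: number
): boolean {
  if (!rule.enabled) return false;
  const gapDays = (Date.parse(now) - Date.parse(lastObservedAt)) / MS_PER_DAY;
  // Unparseable timestamps yield NaN and are never eligible.
  if (isNaN(gapDays) || gapDays < parseDays(rule.observation_gap)) return false;
  return rule.confidence_threshold === undefined || confidence >= rule.confidence_threshold;
}
```

Passing `now` explicitly instead of reading the clock inside the function keeps the check deterministic and easy to test.
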
## Data Contracts

### Drift Event Schema

```typescript
interface ReachabilityDriftEvent {
  drift_id: string;
  detected_at: string;
  image_digest: string;
  package: string;
  function?: string;
  transition: {
    from: K4State;
    to: K4State;
    reason: string;
  };
  affected_vulnerabilities: Array<{
    cve: string;
    severity: string;
    previous_risk: string;
    new_risk: string;
  }>;
  risk_impact: 'increased' | 'decreased' | 'unchanged';
  alert_priority: 'critical' | 'high' | 'medium' | 'low' | 'info';
  evidence: {
    static?: StaticEvidence;
    runtime?: RuntimeEvidence;
  };
}
```

### Drift Alert Schema

```typescript
interface DriftAlert {
  alert_id: string;
  drift_id: string;
  type: 'reachability_drift';
  priority: 'critical' | 'high' | 'medium' | 'low';
  title: string;
  body: {
    summary: string;
    image: string;
    cve: string;
    severity: string;
    transition: string;
    impact: string;
    action_required?: string;
    remediation?: string;
  };
  channels: string[];
  sent_at: string;
  acknowledged_at?: string;
  resolved_at?: string;
}
```

## Drift Policies

### Aggressive (High Security)

```yaml
drift_policy:
  mode: aggressive
  rules:
    - any_reachability_increase: alert_immediately
    - runtime_first_observation: alert_critical
    - contested_state: require_investigation
    - auto_downgrade: disabled
```

### Balanced

```yaml
drift_policy:
  mode: balanced
  rules:
    - critical_cve_reachability_increase: alert_high
    - high_cve_runtime_observation: alert_medium
    - contested_state: alert_medium
    - auto_downgrade:
        enabled: true
        observation_gap: 30d
        confidence_threshold: 0.9
```

### Permissive

```yaml
drift_policy:
  mode: permissive
  rules:
    - runtime_observation_critical: alert_high
    - other_increases: log_only
    - auto_downgrade:
        enabled: true
        observation_gap: 14d
```

## Error Handling

| Error | Recovery |
|-------|----------|
| Signal collection gap | Use last known state, note uncertainty |
| State conflict | Mark as Contested, require review |
| Alert delivery failure | Queue for retry |
| Scan not found | Skip re-evaluation, log warning |

## Observability

### Metrics

| Metric | Type | Labels |
|--------|------|--------|
| `reachability_drift_events_total` | Counter | `transition_type`, `risk_impact` |
| `reachability_state_transitions_total` | Counter | `from_state`, `to_state` |
| `drift_alert_sent_total` | Counter | `priority`, `channel` |
| `drift_detection_latency_ms` | Histogram | - |

### Key Log Events

| Event | Level | Fields |
|-------|-------|--------|
| `reach.state_transition` | INFO | `package`, `from`, `to` |
| `reach.drift_detected` | WARN | `drift_id`, `impact` |
| `reach.verdict_changed` | WARN | `scan_id`, `previous`, `new` |
| `reach.alert_sent` | INFO | `alert_id`, `priority` |

## Related Flows

- [Policy Evaluation Flow](04-policy-evaluation-flow.md) - K4 lattice details
- [Advisory Drift Re-scan Flow](11-advisory-drift-rescan-flow.md) - Similar re-evaluation
- [Risk Score Dashboard Flow](18-risk-score-dashboard-flow.md) - Risk impact
- [Notification Flow](05-notification-flow.md) - Alert delivery