save progress

StellaOps Bot
2026-01-03 00:47:24 +02:00
parent 3f197814c5
commit ca578801fd
319 changed files with 32478 additions and 2202 deletions

@@ -0,0 +1,513 @@
# Reachability Drift Alert Flow
## Overview
The Reachability Drift Alert Flow describes how StellaOps detects and alerts on changes in code reachability that affect vulnerability risk assessments. When runtime observations or static analysis reveal that previously unreachable vulnerable code has become reachable (or vice versa), this flow triggers re-evaluation and notifications.
**Business Value**: Catch newly reachable vulnerabilities before they're exploited, and reduce alert fatigue by downgrading unreachable vulnerabilities automatically.
## Actors
| Actor | Type | Role |
|-------|------|------|
| Signals | Service | Collects runtime telemetry |
| ReachGraph | Service | Analyzes reachability state |
| Scanner | Service | Re-evaluates findings |
| Policy Engine | Service | Re-evaluates verdicts |
| Notify | Service | Sends drift alerts |
| Scheduler | Service | Orchestrates periodic checks |
## Prerequisites
- Runtime instrumentation deployed (eBPF agent or OpenTelemetry)
- Baseline reachability analysis completed
- Drift detection policies configured
- Alert channels configured
## Reachability State Transitions
| From State | To State | Risk Impact | Alert Priority |
|------------|----------|-------------|----------------|
| Unknown | StaticallyReachable | Increased | Medium |
| Unknown | RuntimeObserved | Increased | High |
| StaticallyUnreachable | StaticallyReachable | Increased | Medium |
| StaticallyReachable | RuntimeObserved | Confirmed | High |
| RuntimeObserved | ConfirmedReachable | Confirmed | High |
| StaticallyReachable | StaticallyUnreachable | Decreased | Low |
| RuntimeObserved | RuntimeUnobserved | Decreased | Medium |
| Any | ConfirmedUnreachable | Decreased | Low |
| Any | Contested | Review needed | High |
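
The table can be read as a lookup from a `(from, to)` pair to a risk impact and an alert priority. The following is a minimal sketch of that lookup, not the ReachGraph implementation. It assumes the eight state names in the table make up the whole state set and that an exact row wins over an `Any` row; `TRANSITION_RULES` and `classifyTransition` are illustrative names.

```typescript
// State names are inferred from the table; the real K4State definition may differ.
type K4State =
  | "Unknown" | "StaticallyReachable" | "StaticallyUnreachable"
  | "RuntimeObserved" | "RuntimeUnobserved"
  | "ConfirmedReachable" | "ConfirmedUnreachable" | "Contested";

interface TransitionRule {
  from: K4State | "Any";
  to: K4State;
  impact: "Increased" | "Confirmed" | "Decreased" | "Review needed";
  priority: "High" | "Medium" | "Low";
}

// One entry per row of the table above.
const TRANSITION_RULES: TransitionRule[] = [
  { from: "Unknown", to: "StaticallyReachable", impact: "Increased", priority: "Medium" },
  { from: "Unknown", to: "RuntimeObserved", impact: "Increased", priority: "High" },
  { from: "StaticallyUnreachable", to: "StaticallyReachable", impact: "Increased", priority: "Medium" },
  { from: "StaticallyReachable", to: "RuntimeObserved", impact: "Confirmed", priority: "High" },
  { from: "RuntimeObserved", to: "ConfirmedReachable", impact: "Confirmed", priority: "High" },
  { from: "StaticallyReachable", to: "StaticallyUnreachable", impact: "Decreased", priority: "Low" },
  { from: "RuntimeObserved", to: "RuntimeUnobserved", impact: "Decreased", priority: "Medium" },
  { from: "Any", to: "ConfirmedUnreachable", impact: "Decreased", priority: "Low" },
  { from: "Any", to: "Contested", impact: "Review needed", priority: "High" },
];

// Assumption: an exact (from, to) row takes precedence over an "Any" row.
function classifyTransition(from: K4State, to: K4State): TransitionRule | undefined {
  return (
    TRANSITION_RULES.find((r) => r.from === from && r.to === to) ??
    TRANSITION_RULES.find((r) => r.from === "Any" && r.to === to)
  );
}
```

A pair with no matching row returns `undefined`, which a caller could treat as "no alert" or as an unexpected transition worth logging.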
## Flow Diagram
```
┌────────────────────────────────────────────────────────────────────────────┐
│                       Reachability Drift Alert Flow                        │
└────────────────────────────────────────────────────────────────────────────┘
┌─────────┐  ┌───────────┐  ┌───────────┐  ┌─────────┐  ┌────────┐  ┌────────┐
│ Signals │  │ ReachGraph│  │ Scheduler │  │ Scanner │  │ Policy │  │ Notify │
└────┬────┘  └─────┬─────┘  └─────┬─────┘  └────┬────┘  └───┬────┘  └───┬────┘
     │             │              │             │           │           │
     │ Runtime     │              │             │           │           │
     │ event       │              │             │           │           │
     │────────────>│              │             │           │           │
     │             │              │             │           │           │
     │             │ Update       │             │           │           │
     │             │ call graph   │             │           │           │
     │             │───┐          │             │           │           │
     │             │   │          │             │           │           │
     │             │<──┘          │             │           │           │
     │             │              │             │           │           │
     │             │ Detect       │             │           │           │
     │             │ state change │             │           │           │
     │             │───┐          │             │           │           │
     │             │   │          │             │           │           │
     │             │<──┘          │             │           │           │
     │             │              │             │           │           │
     │             │ [If state    │             │           │           │
     │             │ changed]     │             │           │           │
     │             │              │             │           │           │
     │             │ Emit drift   │             │           │           │
     │             │ event        │             │           │           │
     │             │─────────────>│             │           │           │
     │             │              │             │           │           │
     │             │              │ Queue       │           │           │
     │             │              │ re-eval     │           │           │
     │             │              │────────────>│           │           │
     │             │              │             │           │           │
     │             │              │             │ Load scan │           │
     │             │              │             │ + findings│           │
     │             │              │             │───┐       │           │
     │             │              │             │   │       │           │
     │             │              │             │<──┘       │           │
     │             │              │             │           │           │
     │             │              │             │ Update    │           │
     │             │              │             │ reach     │           │
     │             │              │             │ states    │           │
     │             │              │             │───┐       │           │
     │             │              │             │   │       │           │
     │             │              │             │<──┘       │           │
     │             │              │             │           │           │
     │             │              │             │ Re-eval   │           │
     │             │              │             │──────────>│           │
     │             │              │             │           │           │
     │             │              │             │           │ Compare   │
     │             │              │             │           │ verdicts  │
     │             │              │             │           │───┐       │
     │             │              │             │           │   │       │
     │             │              │             │           │<──┘       │
     │             │              │             │           │           │
     │             │              │             │ New       │           │
     │             │              │             │ verdict   │           │
     │             │              │             │<──────────│           │
     │             │              │             │           │           │
     │             │              │ [If verdict │           │           │
     │             │              │ changed]    │           │           │
     │             │              │             │           │           │
     │             │              │ Alert       │           │           │
     │             │              │────────────────────────────────────>│
     │             │              │             │           │           │
     │             │              │             │           │           │ Send
     │             │              │             │           │           │ alert
     │             │              │             │           │           │───┐
     │             │              │             │           │           │   │
     │             │              │             │           │           │<──┘
     │             │              │             │           │           │
```
## Step-by-Step
### 1. Runtime Event Collection
Signals service collects function invocation data:
```json
{
"event_type": "function_invocation",
"timestamp": "2024-12-29T10:30:00Z",
"source": "ebpf-agent",
"payload": {
"container_id": "abc123...",
"image_digest": "sha256:...",
"function": "lodash.template",
"package": "pkg:npm/lodash@4.17.20",
"call_stack": [
"app/routes/render.js:45",
"lib/template-engine.js:123",
"node_modules/lodash/template.js:89"
],
"invocation_count": 1
}
}
```
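
Because these events come from external agents, a consumer would likely check their shape before acting on them. The sketch below is a hand-written TypeScript type guard for the payload shown above. The field list is copied from the example; the guard itself (`isFunctionInvocationEvent`) and the rule that `invocation_count` must be a positive integer are assumptions, not documented Signals behavior.

```typescript
// Shape copied from the example event above.
interface FunctionInvocationEvent {
  event_type: "function_invocation";
  timestamp: string;
  source: string;
  payload: {
    container_id: string;
    image_digest: string;
    function: string;
    package: string;
    call_stack: string[];
    invocation_count: number;
  };
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Hypothetical validator: returns true only if every field has the expected type.
function isFunctionInvocationEvent(v: unknown): v is FunctionInvocationEvent {
  if (!isRecord(v) || v.event_type !== "function_invocation") return false;
  if (typeof v.timestamp !== "string" || isNaN(Date.parse(v.timestamp))) return false;
  if (typeof v.source !== "string" || !isRecord(v.payload)) return false;
  const p = v.payload;
  return (
    typeof p.container_id === "string" &&
    typeof p.image_digest === "string" &&
    typeof p.function === "string" &&
    typeof p.package === "string" &&
    Array.isArray(p.call_stack) &&
    p.call_stack.every((frame) => typeof frame === "string") &&
    typeof p.invocation_count === "number" &&
    Math.floor(p.invocation_count) === p.invocation_count &&
    p.invocation_count > 0 // assumption: zero or negative counts are rejected
  );
}
```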
### 2. Reachability State Update
ReachGraph updates the K4 lattice state for the affected function (the lattice itself is described in the Policy Evaluation Flow):
```json
{
"state_transition": {
"package": "pkg:npm/lodash@4.17.20",
"function": "lodash.template",
"image_digest": "sha256:...",
"previous_state": "StaticallyReachable",
"new_state": "RuntimeObserved",
"transition_reason": "first_runtime_invocation",
"evidence": {
"static": {
"call_paths": 3,
"entry_points": ["app/routes/render.js:45"]
},
"runtime": {
"first_observed": "2024-12-29T10:30:00Z",
"invocation_count": 1,
"call_stack_hash": "sha256:stackhash..."
}
}
}
}
```
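
How a runtime observation moves a function between states can be sketched as a small transition function. This is an illustration under stated assumptions, not ReachGraph's actual logic: the `Unknown` and `StaticallyReachable` cases follow the state transition table, while sending an "unreachable" state to `Contested` is inferred from the error-handling rule for state conflicts. The name `applyRuntimeObservation` and the reason string `runtime_contradicts_static` are hypothetical.

```typescript
// State names inferred from the state transition table.
type K4State =
  | "Unknown" | "StaticallyReachable" | "StaticallyUnreachable"
  | "RuntimeObserved" | "RuntimeUnobserved"
  | "ConfirmedReachable" | "ConfirmedUnreachable" | "Contested";

interface StateTransition {
  previous_state: K4State;
  new_state: K4State;
  transition_reason: string;
}

// Returns the transition caused by a runtime invocation, or null if the
// state does not change in this sketch.
function applyRuntimeObservation(current: K4State): StateTransition | null {
  switch (current) {
    case "Unknown":
    case "StaticallyReachable":
      return {
        previous_state: current,
        new_state: "RuntimeObserved",
        transition_reason: "first_runtime_invocation",
      };
    case "StaticallyUnreachable":
    case "ConfirmedUnreachable":
      // Assumption: runtime evidence contradicting an "unreachable" state is a conflict.
      return {
        previous_state: current,
        new_state: "Contested",
        transition_reason: "runtime_contradicts_static", // hypothetical reason code
      };
    default:
      // Other states are left unchanged here; a real implementation may handle them.
      return null;
  }
}
```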
### 3. Drift Detection
ReachGraph compares against baseline and detects drift:
```json
{
"drift_event": {
"drift_id": "drift-123",
"detected_at": "2024-12-29T10:30:01Z",
"image_digest": "sha256:...",
"package": "pkg:npm/lodash@4.17.20",
"transition": {
"from": "StaticallyReachable",
"to": "RuntimeObserved"
},
"affected_vulnerabilities": [
{
"cve": "CVE-2024-1234",
"severity": "critical",
"previous_risk": "medium",
"new_risk": "high"
}
],
"risk_impact": "increased",
"alert_priority": "high"
}
}
```
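
The comparison against the baseline can be sketched as a diff between two state maps. The keying scheme (one string per image, package, and function) and the rule that entries missing from the baseline count as `Unknown` are assumptions made for this example; `detectDrift` is an illustrative name.

```typescript
interface StateDrift {
  key: string;  // hypothetical key, e.g. "<image_digest>|<package>|<function>"
  from: string;
  to: string;
}

// Compares current states against a baseline and returns every entry whose state changed.
function detectDrift(baseline: Map<string, string>, current: Map<string, string>): StateDrift[] {
  const drifts: StateDrift[] = [];
  current.forEach((to, key) => {
    // Assumption: an entry absent from the baseline starts as "Unknown".
    const from = baseline.get(key) ?? "Unknown";
    if (from !== to) drifts.push({ key, from, to });
  });
  return drifts;
}
```

Each returned entry could then be enriched with the affected CVEs and a priority to form a drift event like the one shown above.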
### 4. Re-evaluation Trigger
Scheduler queues affected scans for re-evaluation:
```json
{
"reevaluation_job": {
"job_id": "reeval-456",
"trigger": "reachability_drift",
"drift_id": "drift-123",
"scans": [
{
"scan_id": "scan-abc123",
"image": "docker.io/myorg/app:v1.2.3",
"affected_findings": ["CVE-2024-1234"]
}
],
"priority": "high"
}
}
```
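
Selecting the scans to re-evaluate can be sketched as a filter over known scans: keep those for the drifted image that contain at least one affected CVE. The `ScanRecord` and `DriftSummary` shapes below are hypothetical simplifications, and returning `null` when nothing matches is one possible reading of the "Scan not found" recovery rule.

```typescript
interface DriftSummary {
  drift_id: string;
  image_digest: string;
  cves: string[];
  alert_priority: string;
}

// Hypothetical view of a stored scan; the real Scanner model is not shown in this document.
interface ScanRecord {
  scan_id: string;
  image: string;
  image_digest: string;
  finding_cves: string[];
}

interface ReevaluationJob {
  job_id: string;
  trigger: "reachability_drift";
  drift_id: string;
  scans: Array<{ scan_id: string; image: string; affected_findings: string[] }>;
  priority: string;
}

// Builds a job covering every scan of the drifted image that has an affected finding.
function buildReevaluationJob(
  jobId: string,
  drift: DriftSummary,
  scans: ScanRecord[],
): ReevaluationJob | null {
  const affected = scans
    .filter((s) => s.image_digest === drift.image_digest)
    .map((s) => ({
      scan_id: s.scan_id,
      image: s.image,
      affected_findings: s.finding_cves.filter((cve) => drift.cves.indexOf(cve) !== -1),
    }))
    .filter((s) => s.affected_findings.length > 0);

  if (affected.length === 0) return null; // nothing to re-evaluate; caller logs a warning
  return {
    job_id: jobId,
    trigger: "reachability_drift",
    drift_id: drift.drift_id,
    scans: affected,
    priority: drift.alert_priority,
  };
}
```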
### 5. Finding Re-evaluation
Scanner updates findings with new reachability:
```json
{
"finding_update": {
"scan_id": "scan-abc123",
"cve": "CVE-2024-1234",
"package": "pkg:npm/lodash@4.17.20",
"previous": {
"reachability": "StaticallyReachable",
"confidence": 0.70
},
"updated": {
"reachability": "RuntimeObserved",
"confidence": 0.95,
"evidence": {
"runtime_observed_at": "2024-12-29T10:30:00Z",
"call_stack": ["..."]
}
}
}
}
```
### 6. Policy Re-evaluation
Policy engine re-evaluates with updated reachability:
```json
{
"verdict_comparison": {
"scan_id": "scan-abc123",
"previous_verdict": "WARN",
"new_verdict": "FAIL",
"verdict_changed": true,
"changes": [
{
"finding": "CVE-2024-1234",
"rule": "no-critical-reachable",
"previous_result": "WARN (StaticallyReachable)",
"new_result": "FAIL (RuntimeObserved)",
"reason": "Runtime execution confirmed - elevated from warning to block"
}
]
}
}
```
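
The verdict comparison can be sketched as evaluating the same rule before and after the reachability update and checking whether the worst result changed. The rule body below is only an assumed reading of `no-critical-reachable`, inferred from the example (WARN when statically reachable, FAIL when observed at runtime); the `PASS` verdict and the function names are illustrative.

```typescript
type Verdict = "PASS" | "WARN" | "FAIL";

interface Finding {
  cve: string;
  severity: string;
  reachability: string;
}

// Assumed semantics of the "no-critical-reachable" rule; the real policy may differ.
function evaluateNoCriticalReachable(f: Finding): Verdict {
  if (f.severity !== "critical") return "PASS";
  if (f.reachability === "RuntimeObserved" || f.reachability === "ConfirmedReachable") return "FAIL";
  if (f.reachability === "StaticallyReachable") return "WARN";
  return "PASS";
}

const VERDICT_RANK: Record<Verdict, number> = { PASS: 0, WARN: 1, FAIL: 2 };

// The scan verdict is the worst verdict across its findings.
function scanVerdict(findings: Finding[]): Verdict {
  return findings
    .map((f) => evaluateNoCriticalReachable(f))
    .reduce<Verdict>((worst, v) => (VERDICT_RANK[v] > VERDICT_RANK[worst] ? v : worst), "PASS");
}

function compareVerdicts(before: Finding[], after: Finding[]) {
  const previous_verdict = scanVerdict(before);
  const new_verdict = scanVerdict(after);
  return { previous_verdict, new_verdict, verdict_changed: previous_verdict !== new_verdict };
}
```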
### 7. Alert Generation
Notify sends drift alert:
```json
{
"alert": {
"alert_id": "alert-789",
"type": "reachability_drift",
"priority": "high",
"title": "Vulnerability Reachability Confirmed by Runtime",
"body": {
"summary": "CVE-2024-1234 in lodash@4.17.20 was observed at runtime",
"image": "docker.io/myorg/app:v1.2.3",
"cve": "CVE-2024-1234",
"severity": "critical",
"transition": "StaticallyReachable → RuntimeObserved",
"impact": "Verdict changed from WARN to FAIL",
"action_required": "Immediate remediation recommended",
"remediation": "Upgrade lodash to 4.17.21"
},
"channels": ["slack", "pagerduty", "email"]
}
}
```
### Slack Alert Format
```
🚨 Vulnerability Reachability Confirmed
CVE: CVE-2024-1234 (Critical)
Package: lodash@4.17.20
Image: myorg/app:v1.2.3
State Change: StaticallyReachable → RuntimeObserved
Impact: Verdict changed WARN → FAIL
The vulnerable function `lodash.template` was invoked at runtime,
confirming that the vulnerable code is reachable.
Action: Immediate remediation required
Fix: Upgrade lodash to 4.17.21
[View Details] [Create Ticket] [Add Exception]
```
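
Rendering the alert body into a plain-text message like the one above can be sketched as simple string assembly. This example does not call any Slack API; it only builds the text. `renderAlertText` is an illustrative name, and the line order is a simplification of the format shown.

```typescript
// Body fields as listed in the alert example.
interface DriftAlertBody {
  summary: string;
  image: string;
  cve: string;
  severity: string;
  transition: string;
  impact: string;
  action_required?: string;
  remediation?: string;
}

// Builds the message text; optional fields are omitted when absent.
function renderAlertText(title: string, body: DriftAlertBody): string {
  const lines = [
    title,
    `CVE: ${body.cve} (${body.severity})`,
    `Image: ${body.image}`,
    `State Change: ${body.transition}`,
    `Impact: ${body.impact}`,
    body.summary,
  ];
  if (body.action_required) lines.push(`Action: ${body.action_required}`);
  if (body.remediation) lines.push(`Fix: ${body.remediation}`);
  return lines.join("\n");
}
```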
## Drift Detection Modes
### Real-time Detection
```yaml
drift_detection:
  mode: realtime
  config:
    latency_target: 5s
    buffer_window: 0
    immediate_alert_severity: [critical, high]
```
### Batch Detection
```yaml
drift_detection:
  mode: batch
  config:
    check_interval: 15m
    aggregate_similar: true
    min_invocations_for_transition: 3
```
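
One way to read the batch settings: invocations collected during a check interval are summed per function, and a function only qualifies for a state transition once its total reaches `min_invocations_for_transition`. The sketch below assumes that interpretation and assumes `aggregate_similar` groups by package and function; `readyForTransition` is an illustrative name.

```typescript
interface Invocation {
  package: string;
  function: string;
  invocation_count: number;
}

// Sums invocation counts per package+function and returns the keys that
// reached the threshold within this batch.
function readyForTransition(batch: Invocation[], minInvocations: number): string[] {
  const totals = new Map<string, number>();
  for (const ev of batch) {
    const key = `${ev.package}#${ev.function}`; // hypothetical grouping key
    totals.set(key, (totals.get(key) ?? 0) + ev.invocation_count);
  }
  const ready: string[] = [];
  totals.forEach((count, key) => {
    if (count >= minInvocations) ready.push(key);
  });
  return ready;
}
```

Requiring several invocations trades detection latency for fewer alerts caused by one-off calls.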
### Hybrid Detection
```yaml
drift_detection:
  mode: hybrid
  config:
    realtime_for: [critical]
    batch_for: [high, medium, low]
    batch_interval: 1h
```
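
In hybrid mode the severity of the affected vulnerability decides which path a drift event takes. A minimal routing sketch follows; falling back to batch for a severity that appears in neither list is an assumption, and `routeBySeverity` is an illustrative name.

```typescript
type Severity = "critical" | "high" | "medium" | "low";

interface HybridConfig {
  realtime_for: Severity[];
  batch_for: Severity[];
  batch_interval: string;
}

// Chooses the detection path for one drift event.
function routeBySeverity(cfg: HybridConfig, severity: Severity): "realtime" | "batch" {
  if (cfg.realtime_for.indexOf(severity) !== -1) return "realtime";
  // Assumption: anything not explicitly marked realtime is handled in the next batch.
  return "batch";
}

// Values taken from the hybrid configuration above.
const hybridConfig: HybridConfig = {
  realtime_for: ["critical"],
  batch_for: ["high", "medium", "low"],
  batch_interval: "1h",
};
```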
## Downgrade Handling
When reachability decreases (e.g., the code is removed or the function is no longer observed at runtime), the verdict can be downgraded automatically. Improvements are notified at a lower priority (`info`) than risk increases:
```json
{
  "downgrade_event": {
    "drift_id": "drift-456",
    "type": "risk_decrease",
    "package": "pkg:npm/lodash@4.17.20",
    "cve": "CVE-2024-1234",
    "transition": {
      "from": "RuntimeObserved",
      "to": "RuntimeUnobserved",
      "observation_gap": "30d"
    },
    "risk_impact": "decreased",
    "action": "auto_downgrade",
    "verdict_change": "FAIL → WARN",
    "notification": "info"
  }
}
```
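
The `observation_gap` check can be sketched as a comparison between the last runtime observation and the current time. The parser below only accepts the `<N>d` form used in this document; whether other units are supported is not stated, so it rejects them. `parseGapDays` and `observationGapElapsed` are illustrative names.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Parses a gap such as "30d" into a number of days.
function parseGapDays(gap: string): number {
  const match = /^(\d+)d$/.exec(gap);
  if (!match) throw new Error(`unsupported observation gap format: ${gap}`);
  return Number(match[1]);
}

// True when at least `gap` has passed since the function was last observed.
function observationGapElapsed(lastObservedIso: string, nowIso: string, gap: string): boolean {
  const elapsedMs = Date.parse(nowIso) - Date.parse(lastObservedIso);
  return elapsedMs >= parseGapDays(gap) * MS_PER_DAY;
}
```

For example, a function last observed at `2024-12-29T10:30:00Z` would satisfy a `30d` gap from `2025-01-28T10:30:00Z` onward.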
## Data Contracts
### Drift Event Schema
```typescript
interface ReachabilityDriftEvent {
drift_id: string;
detected_at: string;
image_digest: string;
package: string;
function?: string;
transition: {
from: K4State;
to: K4State;
reason: string;
};
affected_vulnerabilities: Array<{
cve: string;
severity: string;
previous_risk: string;
new_risk: string;
}>;
risk_impact: 'increased' | 'decreased' | 'unchanged';
alert_priority: 'critical' | 'high' | 'medium' | 'low' | 'info';
evidence: {
static?: StaticEvidence;
runtime?: RuntimeEvidence;
};
}
```
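
The schema references `K4State`, `StaticEvidence`, and `RuntimeEvidence` without defining them here. The definitions below are a guess reconstructed from the state transition table and from the evidence object in step 2; the actual types may carry more fields or different names.

```typescript
// Inferred from the state transition table.
type K4State =
  | "Unknown" | "StaticallyReachable" | "StaticallyUnreachable"
  | "RuntimeObserved" | "RuntimeUnobserved"
  | "ConfirmedReachable" | "ConfirmedUnreachable" | "Contested";

// Inferred from the "evidence.static" object in step 2.
interface StaticEvidence {
  call_paths: number;
  entry_points: string[];
}

// Inferred from the "evidence.runtime" object in step 2.
interface RuntimeEvidence {
  first_observed: string; // ISO 8601 timestamp
  invocation_count: number;
  call_stack_hash: string;
}

// The evidence from step 2, typed with the inferred interfaces.
const exampleEvidence: { static?: StaticEvidence; runtime?: RuntimeEvidence } = {
  static: { call_paths: 3, entry_points: ["app/routes/render.js:45"] },
  runtime: {
    first_observed: "2024-12-29T10:30:00Z",
    invocation_count: 1,
    call_stack_hash: "sha256:stackhash...",
  },
};
```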
### Drift Alert Schema
```typescript
interface DriftAlert {
alert_id: string;
drift_id: string;
type: 'reachability_drift';
priority: 'critical' | 'high' | 'medium' | 'low';
title: string;
body: {
summary: string;
image: string;
cve: string;
severity: string;
transition: string;
impact: string;
action_required?: string;
remediation?: string;
};
channels: string[];
sent_at: string;
acknowledged_at?: string;
resolved_at?: string;
}
```
## Drift Policies
### Aggressive (High Security)
```yaml
drift_policy:
  mode: aggressive
  rules:
    - any_reachability_increase: alert_immediately
    - runtime_first_observation: alert_critical
    - contested_state: require_investigation
    - auto_downgrade: disabled
```
### Balanced
```yaml
drift_policy:
  mode: balanced
  rules:
    - critical_cve_reachability_increase: alert_high
    - high_cve_runtime_observation: alert_medium
    - contested_state: alert_medium
    - auto_downgrade:
        enabled: true
        observation_gap: 30d
        confidence_threshold: 0.9
```
### Permissive
```yaml
drift_policy:
  mode: permissive
  rules:
    - runtime_observation_critical: alert_high
    - other_increases: log_only
    - auto_downgrade:
        enabled: true
        observation_gap: 14d
```
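
The three policies differ mainly in how readily they allow an automatic downgrade. The sketch below encodes the `auto_downgrade` settings from the YAML above as a lookup table and applies them. It assumes that `confidence_threshold` is compared against the confidence of the new reachability assessment and that a policy with no threshold accepts any confidence; `mayAutoDowngrade` is an illustrative name.

```typescript
type PolicyMode = "aggressive" | "balanced" | "permissive";

interface AutoDowngradeRule {
  enabled: boolean;
  observation_gap_days?: number;
  confidence_threshold?: number;
}

// Values copied from the three policy examples above.
const AUTO_DOWNGRADE: Record<PolicyMode, AutoDowngradeRule> = {
  aggressive: { enabled: false },
  balanced: { enabled: true, observation_gap_days: 30, confidence_threshold: 0.9 },
  permissive: { enabled: true, observation_gap_days: 14 },
};

// Decides whether a finding may be downgraded without human review.
function mayAutoDowngrade(mode: PolicyMode, daysUnobserved: number, confidence: number): boolean {
  const rule = AUTO_DOWNGRADE[mode];
  if (!rule.enabled) return false;
  if (daysUnobserved < (rule.observation_gap_days ?? 0)) return false;
  // Assumption: a missing threshold means any confidence is accepted.
  return confidence >= (rule.confidence_threshold ?? 0);
}
```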
## Error Handling
| Error | Recovery |
|-------|----------|
| Signal collection gap | Use last known state, note uncertainty |
| State conflict | Mark as Contested, require review |
| Alert delivery failure | Queue for retry |
| Scan not found | Skip re-evaluation, log warning |
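
"Queue for retry" leaves the retry schedule open. One common choice is capped exponential backoff, sketched below. The base delay, cap, and attempt limit are placeholder values chosen for the example, not documented Notify settings, and `retryDelayMs` and `scheduleRetry` are illustrative names.

```typescript
interface PendingAlert {
  alert_id: string;
  attempts: number;        // delivery attempts made so far
  next_attempt_at: number; // epoch milliseconds
}

// Delay doubles with each attempt and is capped at capMs.
function retryDelayMs(attempt: number, baseMs = 1000, capMs = 60000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

// Records a failed delivery and returns the rescheduled entry, or null once
// the attempt limit is reached (the caller would then log and drop or escalate).
function scheduleRetry(p: PendingAlert, nowMs: number, maxAttempts = 5): PendingAlert | null {
  const attempts = p.attempts + 1;
  if (attempts >= maxAttempts) return null;
  return {
    alert_id: p.alert_id,
    attempts,
    next_attempt_at: nowMs + retryDelayMs(attempts),
  };
}
```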
## Observability
### Metrics
| Metric | Type | Labels |
|--------|------|--------|
| `reachability_drift_events_total` | Counter | `transition_type`, `risk_impact` |
| `reachability_state_transitions_total` | Counter | `from_state`, `to_state` |
| `drift_alert_sent_total` | Counter | `priority`, `channel` |
| `drift_detection_latency_ms` | Histogram | - |
### Key Log Events
| Event | Level | Fields |
|-------|-------|--------|
| `reach.state_transition` | INFO | `package`, `from`, `to` |
| `reach.drift_detected` | WARN | `drift_id`, `impact` |
| `reach.verdict_changed` | WARN | `scan_id`, `previous`, `new` |
| `reach.alert_sent` | INFO | `alert_id`, `priority` |
## Related Flows
- [Policy Evaluation Flow](04-policy-evaluation-flow.md) - K4 lattice details
- [Advisory Drift Re-scan Flow](11-advisory-drift-rescan-flow.md) - Similar re-evaluation
- [Risk Score Dashboard Flow](18-risk-score-dashboard-flow.md) - Risk impact
- [Notification Flow](05-notification-flow.md) - Alert delivery