Reachability Drift Alert Flow

Overview

The Reachability Drift Alert Flow describes how StellaOps detects and alerts on changes in code reachability that affect vulnerability risk assessments. When runtime observations or static analysis reveal that previously unreachable vulnerable code has become reachable (or vice versa), this flow triggers re-evaluation and notifications.

Business Value: Catch newly reachable vulnerabilities before they're exploited, and reduce alert fatigue by downgrading unreachable vulnerabilities automatically.

Actors

| Actor | Type | Role |
|-------|------|------|
| Signals | Service | Collects runtime telemetry |
| ReachGraph | Service | Analyzes reachability state |
| Scanner | Service | Re-evaluates findings |
| Policy Engine | Service | Re-evaluates verdicts |
| Notify | Service | Sends drift alerts |
| Scheduler | Service | Orchestrates periodic checks |

Prerequisites

  • Runtime instrumentation deployed (eBPF agent or OpenTelemetry)
  • Baseline reachability analysis completed
  • Drift detection policies configured
  • Alert channels configured

Reachability State Transitions

| From State | To State | Risk Impact | Alert Priority |
|------------|----------|-------------|----------------|
| Unknown | StaticallyReachable | Increased | Medium |
| Unknown | RuntimeObserved | Increased | High |
| StaticallyUnreachable | StaticallyReachable | Increased | Medium |
| StaticallyReachable | RuntimeObserved | Confirmed | High |
| RuntimeObserved | ConfirmedReachable | Confirmed | High |
| StaticallyReachable | StaticallyUnreachable | Decreased | Low |
| RuntimeObserved | RuntimeUnobserved | Decreased | Medium |
| Any | ConfirmedUnreachable | Decreased | Low |
| Any | Contested | Review needed | High |
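
The transition table can be read as a lookup from a state pair to a risk impact and alert priority. The following is a minimal sketch of that lookup, not the ReachGraph implementation: the names `TransitionRule`, `TRANSITION_RULES`, and `classifyTransition` are hypothetical, and the impact labels mirror the table rather than the `risk_impact` values of the drift event schema.

```typescript
// State names come from the transition table; everything else is illustrative.
type ReachState =
  | "Unknown" | "StaticallyReachable" | "StaticallyUnreachable"
  | "RuntimeObserved" | "RuntimeUnobserved"
  | "ConfirmedReachable" | "ConfirmedUnreachable" | "Contested";

type RiskImpact = "increased" | "confirmed" | "decreased" | "review_needed";
type AlertPriority = "high" | "medium" | "low";

interface TransitionRule {
  from: ReachState | "Any";
  to: ReachState;
  impact: RiskImpact;
  priority: AlertPriority;
}

// Rows are listed in table order: specific pairs first, "Any" wildcards last.
const TRANSITION_RULES: TransitionRule[] = [
  { from: "Unknown", to: "StaticallyReachable", impact: "increased", priority: "medium" },
  { from: "Unknown", to: "RuntimeObserved", impact: "increased", priority: "high" },
  { from: "StaticallyUnreachable", to: "StaticallyReachable", impact: "increased", priority: "medium" },
  { from: "StaticallyReachable", to: "RuntimeObserved", impact: "confirmed", priority: "high" },
  { from: "RuntimeObserved", to: "ConfirmedReachable", impact: "confirmed", priority: "high" },
  { from: "StaticallyReachable", to: "StaticallyUnreachable", impact: "decreased", priority: "low" },
  { from: "RuntimeObserved", to: "RuntimeUnobserved", impact: "decreased", priority: "medium" },
  { from: "Any", to: "ConfirmedUnreachable", impact: "decreased", priority: "low" },
  { from: "Any", to: "Contested", impact: "review_needed", priority: "high" },
];

// Returns the first matching rule, or undefined for pairs the table does not list.
function classifyTransition(from: ReachState, to: ReachState): TransitionRule | undefined {
  for (const rule of TRANSITION_RULES) {
    if ((rule.from === from || rule.from === "Any") && rule.to === to) {
      return rule;
    }
  }
  return undefined;
}
```

Returning `undefined` for unlisted pairs leaves it to the caller to decide whether an unknown transition should be ignored or escalated.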

Flow Diagram

┌─────────────────────────────────────────────────────────────────────────────────┐
│                      Reachability Drift Alert Flow                               │
└─────────────────────────────────────────────────────────────────────────────────┘

┌─────────┐  ┌───────────┐  ┌───────────┐  ┌─────────┐  ┌────────┐  ┌────────┐
│ Signals │  │ ReachGraph│  │ Scheduler │  │ Scanner │  │ Policy │  │ Notify │
└────┬────┘  └─────┬─────┘  └─────┬─────┘  └────┬────┘  └───┬────┘  └───┬────┘
     │             │              │             │           │           │
     │ Runtime     │              │             │           │           │
     │ event       │              │             │           │           │
     │────────────>│              │             │           │           │
     │             │              │             │           │           │
     │             │ Update       │             │           │           │
     │             │ call graph   │             │           │           │
     │             │───┐          │             │           │           │
     │             │   │          │             │           │           │
     │             │<──┘          │             │           │           │
     │             │              │             │           │           │
     │             │ Detect       │             │           │           │
     │             │ state change │             │           │           │
     │             │───┐          │             │           │           │
     │             │   │          │             │           │           │
     │             │<──┘          │             │           │           │
     │             │              │             │           │           │
     │             │ [If state    │             │           │           │
     │             │  changed]    │             │           │           │
     │             │              │             │           │           │
     │             │ Emit drift   │             │           │           │
     │             │ event        │             │           │           │
     │             │─────────────>│             │           │           │
     │             │              │             │           │           │
     │             │              │ Queue       │           │           │
     │             │              │ re-eval     │           │           │
     │             │              │────────────>│           │           │
     │             │              │             │           │           │
     │             │              │             │ Load scan │           │
     │             │              │             │ + findings│           │
     │             │              │             │───┐       │           │
     │             │              │             │   │       │           │
     │             │              │             │<──┘       │           │
     │             │              │             │           │           │
     │             │              │             │ Update    │           │
     │             │              │             │ reach     │           │
     │             │              │             │ states    │           │
     │             │              │             │───┐       │           │
     │             │              │             │   │       │           │
     │             │              │             │<──┘       │           │
     │             │              │             │           │           │
     │             │              │             │ Re-eval   │           │
     │             │              │             │──────────>│           │
     │             │              │             │           │           │
     │             │              │             │           │ Compare   │
     │             │              │             │           │ verdicts  │
     │             │              │             │           │───┐       │
     │             │              │             │           │   │       │
     │             │              │             │           │<──┘       │
     │             │              │             │           │           │
     │             │              │             │ New       │           │
     │             │              │             │ verdict   │           │
     │             │              │             │<──────────│           │
     │             │              │             │           │           │
     │             │              │ [If verdict │           │           │
     │             │              │  changed]   │           │           │
     │             │              │             │           │           │
     │             │              │ Alert       │           │           │
     │             │              │─────────────────────────────────────>│
     │             │              │             │           │           │
     │             │              │             │           │           │ Send
     │             │              │             │           │           │ alert
     │             │              │             │           │           │───┐
     │             │              │             │           │           │   │
     │             │              │             │           │           │<──┘
     │             │              │             │           │           │

Step-by-Step

1. Runtime Event Collection

Signals service collects function invocation data:

{
  "event_type": "function_invocation",
  "timestamp": "2024-12-29T10:30:00Z",
  "source": "ebpf-agent",
  "payload": {
    "container_id": "abc123...",
    "image_digest": "sha256:...",
    "function": "lodash.template",
    "package": "pkg:npm/lodash@4.17.20",
    "call_stack": [
      "app/routes/render.js:45",
      "lib/template-engine.js:123",
      "node_modules/lodash/template.js:89"
    ],
    "invocation_count": 1
  }
}
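
Before an event like this reaches ReachGraph, a consumer would likely check its shape. The sketch below assumes the field names shown in the example payload; the `FunctionInvocationEvent` interface and the `isFunctionInvocationEvent` guard are hypothetical names, and the checks are illustrative rather than the service's actual validation rules.

```typescript
// Shape inferred from the example event above; not an official schema.
interface FunctionInvocationEvent {
  event_type: "function_invocation";
  timestamp: string;
  source: string;
  payload: {
    container_id: string;
    image_digest: string;
    function: string;
    package: string;
    call_stack: string[];
    invocation_count: number;
  };
}

// Type guard: accepts only objects that carry the fields drift detection needs.
function isFunctionInvocationEvent(value: unknown): value is FunctionInvocationEvent {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  if (candidate["event_type"] !== "function_invocation") return false;

  const timestamp = candidate["timestamp"];
  if (typeof timestamp !== "string" || isNaN(Date.parse(timestamp))) return false;

  const rawPayload = candidate["payload"];
  if (typeof rawPayload !== "object" || rawPayload === null) return false;
  const payload = rawPayload as Record<string, unknown>;

  const count = payload["invocation_count"];
  return (
    typeof payload["image_digest"] === "string" &&
    typeof payload["function"] === "string" &&
    typeof payload["package"] === "string" &&
    Array.isArray(payload["call_stack"]) &&
    typeof count === "number" &&
    count > 0
  );
}
```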

2. Reachability State Update

ReachGraph updates the K4 lattice state:

{
  "state_transition": {
    "package": "pkg:npm/lodash@4.17.20",
    "function": "lodash.template",
    "image_digest": "sha256:...",
    "previous_state": "StaticallyReachable",
    "new_state": "RuntimeObserved",
    "transition_reason": "first_runtime_invocation",
    "evidence": {
      "static": {
        "call_paths": 3,
        "entry_points": ["app/routes/render.js:45"]
      },
      "runtime": {
        "first_observed": "2024-12-29T10:30:00Z",
        "invocation_count": 1,
        "call_stack_hash": "sha256:stackhash..."
      }
    }
  }
}
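
One way to picture the state update is as a function from the previous state to a transition record. This is a deliberately incomplete sketch: only the `StaticallyReachable` to `RuntimeObserved` case and the reason `first_runtime_invocation` appear in this document. The `Contested` branch is an assumption based on the error handling rule for state conflicts, and the reason string `static_runtime_conflict` is hypothetical.

```typescript
type ReachState =
  | "Unknown" | "StaticallyReachable" | "StaticallyUnreachable"
  | "RuntimeObserved" | "RuntimeUnobserved"
  | "ConfirmedReachable" | "ConfirmedUnreachable" | "Contested";

interface StateTransition {
  previous_state: ReachState;
  new_state: ReachState;
  transition_reason: string;
}

// Applies a single runtime invocation to the previous state.
// Returns null when this sketch defines no state change.
function applyRuntimeObservation(previous: ReachState): StateTransition | null {
  switch (previous) {
    case "Unknown":
    case "StaticallyReachable":
      return {
        previous_state: previous,
        new_state: "RuntimeObserved",
        transition_reason: "first_runtime_invocation",
      };
    case "StaticallyUnreachable":
      // Assumption: static and runtime evidence disagree, so flag for review.
      return {
        previous_state: previous,
        new_state: "Contested",
        transition_reason: "static_runtime_conflict", // hypothetical reason string
      };
    default:
      return null;
  }
}
```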

3. Drift Detection

ReachGraph compares the new state against the baseline and detects drift:

{
  "drift_event": {
    "drift_id": "drift-123",
    "detected_at": "2024-12-29T10:30:01Z",
    "image_digest": "sha256:...",
    "package": "pkg:npm/lodash@4.17.20",
    "transition": {
      "from": "StaticallyReachable",
      "to": "RuntimeObserved"
    },
    "affected_vulnerabilities": [
      {
        "cve": "CVE-2024-1234",
        "severity": "critical",
        "previous_risk": "medium",
        "new_risk": "high"
      }
    ],
    "risk_impact": "increased",
    "alert_priority": "high"
  }
}
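
Comparing against a baseline amounts to diffing two snapshots of reachability states. The sketch below assumes each snapshot is a plain map keyed by a hypothetical `"<package>#<function>"` string. `StateSnapshot`, `DetectedDrift`, and `diffAgainstBaseline` are illustrative names, and treating a key missing from the baseline as `Unknown` is an assumption.

```typescript
// Key format "<package>#<function>" is hypothetical; values are state names.
type StateSnapshot = Record<string, string>;

interface DetectedDrift {
  key: string;
  from: string;
  to: string;
}

// Emits one record per key whose state differs from the baseline.
function diffAgainstBaseline(baseline: StateSnapshot, current: StateSnapshot): DetectedDrift[] {
  const drifts: DetectedDrift[] = [];
  for (const key of Object.keys(current)) {
    const to = current[key];
    if (to === undefined) continue;
    // Assumption: anything absent from the baseline starts as "Unknown".
    const from = baseline[key] ?? "Unknown";
    if (from !== to) {
      drifts.push({ key, from, to });
    }
  }
  return drifts;
}
```

Each emitted record carries the `from` and `to` values that populate the `transition` field of a drift event.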

4. Re-evaluation Trigger

Scheduler queues affected scans for re-evaluation:

{
  "reevaluation_job": {
    "job_id": "reeval-456",
    "trigger": "reachability_drift",
    "drift_id": "drift-123",
    "scans": [
      {
        "scan_id": "scan-abc123",
        "image": "docker.io/myorg/app:v1.2.3",
        "affected_findings": ["CVE-2024-1234"]
      }
    ],
    "priority": "high"
  }
}
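
Selecting the scans to re-evaluate can be sketched as a filter over stored scans: keep those that share the image digest and contain at least one affected CVE. The `DriftSummary` and `ScanRecord` shapes and the `buildReevaluationJob` function are hypothetical; the job fields follow the example above. Returning `null` when nothing matches corresponds to the "skip re-evaluation" recovery listed under Error Handling.

```typescript
interface DriftSummary {
  drift_id: string;
  image_digest: string;
  cves: string[];
  alert_priority: string;
}

interface ScanRecord {
  scan_id: string;
  image: string;
  image_digest: string;
  findings: string[]; // CVE identifiers reported by the scan
}

interface ScanTarget {
  scan_id: string;
  image: string;
  affected_findings: string[];
}

interface ReevaluationJob {
  job_id: string;
  trigger: "reachability_drift";
  drift_id: string;
  scans: ScanTarget[];
  priority: string;
}

// Builds a job for every scan of the drifted image that contains an affected CVE.
function buildReevaluationJob(drift: DriftSummary, scans: ScanRecord[], jobId: string): ReevaluationJob | null {
  const targets: ScanTarget[] = [];
  for (const scan of scans) {
    if (scan.image_digest !== drift.image_digest) continue;
    const hits = scan.findings.filter((cve) => drift.cves.indexOf(cve) !== -1);
    if (hits.length > 0) {
      targets.push({ scan_id: scan.scan_id, image: scan.image, affected_findings: hits });
    }
  }
  if (targets.length === 0) return null; // nothing to re-evaluate
  return {
    job_id: jobId,
    trigger: "reachability_drift",
    drift_id: drift.drift_id,
    scans: targets,
    priority: drift.alert_priority,
  };
}
```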

5. Finding Re-evaluation

Scanner updates findings with new reachability:

{
  "finding_update": {
    "scan_id": "scan-abc123",
    "cve": "CVE-2024-1234",
    "package": "pkg:npm/lodash@4.17.20",
    "previous": {
      "reachability": "StaticallyReachable",
      "confidence": 0.70
    },
    "updated": {
      "reachability": "RuntimeObserved",
      "confidence": 0.95,
      "evidence": {
        "runtime_observed_at": "2024-12-29T10:30:00Z",
        "call_stack": ["..."]
      }
    }
  }
}

6. Policy Re-evaluation

Policy engine re-evaluates with updated reachability:

{
  "verdict_comparison": {
    "scan_id": "scan-abc123",
    "previous_verdict": "WARN",
    "new_verdict": "FAIL",
    "verdict_changed": true,
    "changes": [
      {
        "finding": "CVE-2024-1234",
        "rule": "no-critical-reachable",
        "previous_result": "WARN (StaticallyReachable)",
        "new_result": "FAIL (RuntimeObserved)",
        "reason": "Runtime execution confirmed - elevated from warning to block"
      }
    ]
  }
}
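
The verdict comparison can be illustrated with a toy version of the `no-critical-reachable` rule. The rule's real definition is not given in this document, so the thresholds below are inferred only from the example (WARN for `StaticallyReachable`, FAIL for `RuntimeObserved`) and should be read as an assumption. `evaluateNoCriticalReachable` and `compareVerdicts` are hypothetical names.

```typescript
type Verdict = "PASS" | "WARN" | "FAIL";

interface FindingState {
  cve: string;
  severity: string;
  reachability: string;
}

// Assumed reading of the rule: critical findings fail once runtime evidence
// exists, warn when only statically reachable, and pass otherwise.
function evaluateNoCriticalReachable(finding: FindingState): Verdict {
  if (finding.severity !== "critical") return "PASS";
  if (finding.reachability === "RuntimeObserved" || finding.reachability === "ConfirmedReachable") {
    return "FAIL";
  }
  if (finding.reachability === "StaticallyReachable") return "WARN";
  return "PASS";
}

// Evaluates the finding before and after the reachability update.
function compareVerdicts(previous: FindingState, updated: FindingState) {
  const previous_verdict = evaluateNoCriticalReachable(previous);
  const new_verdict = evaluateNoCriticalReachable(updated);
  return { previous_verdict, new_verdict, verdict_changed: previous_verdict !== new_verdict };
}
```

Only results with `verdict_changed` set to true proceed to alert generation, matching the "[If verdict changed]" guard in the flow diagram.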

7. Alert Generation

Notify sends drift alert:

{
  "alert": {
    "alert_id": "alert-789",
    "type": "reachability_drift",
    "priority": "high",
    "title": "Vulnerability Reachability Confirmed by Runtime",
    "body": {
      "summary": "CVE-2024-1234 in lodash@4.17.20 was observed at runtime",
      "image": "docker.io/myorg/app:v1.2.3",
      "cve": "CVE-2024-1234",
      "severity": "critical",
      "transition": "StaticallyReachable → RuntimeObserved",
      "impact": "Verdict changed from WARN to FAIL",
      "action_required": "Immediate remediation recommended",
      "remediation": "Upgrade lodash to 4.17.21"
    },
    "channels": ["slack", "pagerduty", "email"]
  }
}

Slack Alert Format

🚨 Vulnerability Reachability Confirmed

CVE: CVE-2024-1234 (Critical)
Package: lodash@4.17.20
Image: myorg/app:v1.2.3

State Change: StaticallyReachable → RuntimeObserved
Impact: Verdict changed WARN → FAIL

The vulnerable function `lodash.template` was invoked at runtime,
confirming that the vulnerable code is reachable.

Action: Immediate remediation recommended
Fix: Upgrade lodash to 4.17.21

[View Details] [Create Ticket] [Add Exception]
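
The message text above can be derived from the alert body. A rough sketch follows, assuming the body fields from step 7. `renderSlackText` is a hypothetical helper, and `packageLabel` is passed separately because the alert body does not include a package field. The free-text paragraph and the action buttons are omitted.

```typescript
interface SlackAlertInput {
  cve: string;
  severity: string;
  image: string;
  transition: string;
  impact: string;
  action_required?: string;
  remediation?: string;
}

// Upper-cases the first character, e.g. "critical" becomes "Critical".
function capitalize(word: string): string {
  return word.length === 0 ? word : word.charAt(0).toUpperCase() + word.slice(1);
}

// Renders the plain-text portion of the Slack message, one field per line.
function renderSlackText(input: SlackAlertInput, packageLabel: string): string {
  const lines: string[] = [
    "🚨 Vulnerability Reachability Confirmed",
    "",
    `CVE: ${input.cve} (${capitalize(input.severity)})`,
    `Package: ${packageLabel}`,
    `Image: ${input.image}`,
    "",
    `State Change: ${input.transition}`,
    `Impact: ${input.impact}`,
  ];
  if (input.action_required !== undefined) {
    lines.push("", `Action: ${input.action_required}`);
  }
  if (input.remediation !== undefined) {
    lines.push(`Fix: ${input.remediation}`);
  }
  return lines.join("\n");
}
```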

Drift Detection Modes

Real-time Detection

drift_detection:
  mode: realtime
  config:
    latency_target: 5s
    buffer_window: 0
    immediate_alert_severity: [critical, high]

Batch Detection

drift_detection:
  mode: batch
  config:
    check_interval: 15m
    aggregate_similar: true
    min_invocations_for_transition: 3

Hybrid Detection

drift_detection:
  mode: hybrid
  config:
    realtime_for: [critical]
    batch_for: [high, medium, low]
    batch_interval: 1h
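
The three modes differ mainly in which path a drift takes. The sketch below routes a drift by severity under each mode. It assumes the config keys shown above, and `routeDrift` and `meetsBatchThreshold` are hypothetical names. Severities that a hybrid config does not list under `realtime_for` fall back to batch, which is an assumption.

```typescript
type Severity = "critical" | "high" | "medium" | "low";

type DetectionConfig =
  | { mode: "realtime" }
  | { mode: "batch" }
  | { mode: "hybrid"; realtime_for: Severity[]; batch_for: Severity[] };

// Decides whether a drift is processed immediately or in the next batch run.
function routeDrift(config: DetectionConfig, severity: Severity): "realtime" | "batch" {
  if (config.mode !== "hybrid") return config.mode;
  return config.realtime_for.indexOf(severity) !== -1 ? "realtime" : "batch";
}

// Batch mode only: a transition counts once enough invocations were seen
// (compare min_invocations_for_transition in the batch config).
function meetsBatchThreshold(invocationCount: number, minInvocations: number): boolean {
  return invocationCount >= minInvocations;
}
```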

Downgrade Handling

When reachability decreases (e.g., the vulnerable code is removed or is no longer observed at runtime), the drift is recorded as a downgrade. Because a downgrade is an improvement, it is notified at the lower info priority:

{
  "downgrade_event": {
    "drift_id": "drift-456",
    "type": "risk_decrease",
    "package": "pkg:npm/lodash@4.17.20",
    "cve": "CVE-2024-1234",
    "transition": {
      "from": "RuntimeObserved",
      "to": "RuntimeUnobserved",
      "observation_gap": "30d"
    },
    "risk_impact": "decreased",
    "action": "auto_downgrade",
    "verdict_change": "FAIL → WARN",
    "notification": "info"
  }
}
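
The `observation_gap` check reduces to comparing the time since the last runtime observation with the configured gap. A minimal sketch, assuming the gap is always expressed in whole days (`"30d"`), since that is the only form this document shows. `parseGapDays` and `observationGapExceeded` are hypothetical helpers.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Parses values such as "30d"; other units are rejected in this sketch.
function parseGapDays(gap: string): number {
  const match = /^(\d+)d$/.exec(gap);
  if (match === null) {
    throw new Error(`unsupported observation gap: ${gap}`);
  }
  return Number(match[1]);
}

// True when at least `gap` has elapsed since the last runtime observation.
function observationGapExceeded(lastObservedIso: string, nowIso: string, gap: string): boolean {
  const elapsedMs = Date.parse(nowIso) - Date.parse(lastObservedIso);
  return elapsedMs >= parseGapDays(gap) * DAY_MS;
}
```

Passing `nowIso` explicitly instead of reading the clock keeps the check deterministic and easy to test.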

Data Contracts

Drift Event Schema

interface ReachabilityDriftEvent {
  drift_id: string;
  detected_at: string;
  image_digest: string;
  package: string;
  function?: string;
  transition: {
    from: K4State;
    to: K4State;
    reason: string;
  };
  affected_vulnerabilities: Array<{
    cve: string;
    severity: string;
    previous_risk: string;
    new_risk: string;
  }>;
  risk_impact: 'increased' | 'decreased' | 'unchanged';
  alert_priority: 'critical' | 'high' | 'medium' | 'low' | 'info';
  evidence: {
    static?: StaticEvidence;
    runtime?: RuntimeEvidence;
  };
}
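
The schema references `K4State`, `StaticEvidence`, and `RuntimeEvidence` without defining them. The definitions below are a guess at their shape, not the published contract: the state list is taken from the Reachability State Transitions table, and the evidence fields are taken from the example in step 2. `isK4State` is a hypothetical helper.

```typescript
// Assumed state list, collected from the transition table in this document.
const K4_STATES = [
  "Unknown",
  "StaticallyReachable",
  "StaticallyUnreachable",
  "RuntimeObserved",
  "RuntimeUnobserved",
  "ConfirmedReachable",
  "ConfirmedUnreachable",
  "Contested",
] as const;

type K4State = typeof K4_STATES[number];

// Fields mirror the "static" evidence object in the step 2 example.
interface StaticEvidence {
  call_paths: number;
  entry_points: string[];
}

// Fields mirror the "runtime" evidence object in the step 2 example.
interface RuntimeEvidence {
  first_observed: string;
  invocation_count: number;
  call_stack_hash: string;
}

// Narrows an arbitrary string to K4State when it names a known state.
function isK4State(value: string): value is K4State {
  return (K4_STATES as readonly string[]).indexOf(value) !== -1;
}
```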

Drift Alert Schema

interface DriftAlert {
  alert_id: string;
  drift_id: string;
  type: 'reachability_drift';
  priority: 'critical' | 'high' | 'medium' | 'low';
  title: string;
  body: {
    summary: string;
    image: string;
    cve: string;
    severity: string;
    transition: string;
    impact: string;
    action_required?: string;
    remediation?: string;
  };
  channels: string[];
  sent_at: string;
  acknowledged_at?: string;
  resolved_at?: string;
}

Drift Policies

Aggressive (High Security)

drift_policy:
  mode: aggressive
  rules:
    - any_reachability_increase: alert_immediately
    - runtime_first_observation: alert_critical
    - contested_state: require_investigation
    - auto_downgrade: disabled

Balanced

drift_policy:
  mode: balanced
  rules:
    - critical_cve_reachability_increase: alert_high
    - high_cve_runtime_observation: alert_medium
    - contested_state: alert_medium
    - auto_downgrade:
        enabled: true
        observation_gap: 30d
        confidence_threshold: 0.9

Permissive

drift_policy:
  mode: permissive
  rules:
    - runtime_observation_critical: alert_high
    - other_increases: log_only
    - auto_downgrade:
        enabled: true
        observation_gap: 14d
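
The `auto_downgrade` settings of the three policies can be compared with a small eligibility check. The values below are copied from the policy examples above. Treating a missing `confidence_threshold` as "no confidence requirement" is an assumption, and `AUTO_DOWNGRADE` and `canAutoDowngrade` are hypothetical names.

```typescript
type PolicyMode = "aggressive" | "balanced" | "permissive";

interface AutoDowngradeRule {
  enabled: boolean;
  observation_gap_days?: number;
  confidence_threshold?: number;
}

// Values copied from the three policy examples.
const AUTO_DOWNGRADE: Record<PolicyMode, AutoDowngradeRule> = {
  aggressive: { enabled: false },
  balanced: { enabled: true, observation_gap_days: 30, confidence_threshold: 0.9 },
  permissive: { enabled: true, observation_gap_days: 14 },
};

// True when the policy allows downgrading after `gapDays` without
// observations, given the confidence of the reachability assessment.
function canAutoDowngrade(mode: PolicyMode, gapDays: number, confidence: number): boolean {
  const rule = AUTO_DOWNGRADE[mode];
  if (!rule.enabled) return false;
  if (rule.observation_gap_days !== undefined && gapDays < rule.observation_gap_days) return false;
  if (rule.confidence_threshold !== undefined && confidence < rule.confidence_threshold) return false;
  return true;
}
```

For example, a 20-day gap qualifies under the permissive policy but not under the balanced one, and never under the aggressive one.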

Error Handling

| Error | Recovery |
|-------|----------|
| Signal collection gap | Use last known state, note uncertainty |
| State conflict | Mark as Contested, require review |
| Alert delivery failure | Queue for retry |
| Scan not found | Skip re-evaluation, log warning |
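
The "queue for retry" recovery can be sketched as a bounded retry loop. This is a simplified, synchronous illustration with no backoff delay. The `QueuedAlert` shape, the injected `send` callback, and the `maxAttempts` limit are assumptions and not part of the Notify service's documented behavior.

```typescript
interface QueuedAlert {
  alert_id: string;
  attempts: number;
}

// Drains the queue, re-queueing failed alerts until maxAttempts is reached.
// `send` returns true on successful delivery; it is injected so the sketch
// stays independent of any particular channel.
function deliverWithRetry(
  queue: QueuedAlert[],
  send: (alertId: string) => boolean,
  maxAttempts: number
): { delivered: string[]; dropped: string[] } {
  const delivered: string[] = [];
  const dropped: string[] = [];
  while (queue.length > 0) {
    const item = queue.shift();
    if (item === undefined) break;
    item.attempts += 1;
    if (send(item.alert_id)) {
      delivered.push(item.alert_id);
    } else if (item.attempts < maxAttempts) {
      queue.push(item); // retry after the alerts already waiting
    } else {
      dropped.push(item.alert_id); // retries exhausted; surface to operators
    }
  }
  return { delivered, dropped };
}
```

Because every pass increments `attempts`, the loop always terminates: each alert is either delivered or dropped after at most `maxAttempts` tries.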

Observability

Metrics

| Metric | Type | Labels |
|--------|------|--------|
| reachability_drift_events_total | Counter | transition_type, risk_impact |
| reachability_state_transitions_total | Counter | from_state, to_state |
| drift_alert_sent_total | Counter | priority, channel |
| drift_detection_latency_ms | Histogram | - |
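
To show how the label sets in the table are meant to be used, here is a tiny in-memory labeled counter. It only illustrates the counting model: a real deployment would presumably use a metrics client library, and the `LabeledCounter` class is hypothetical.

```typescript
// Minimal labeled counter: one running total per distinct label combination.
class LabeledCounter {
  private readonly labelNames: string[];
  private readonly totals: Record<string, number> = {};

  constructor(labelNames: string[]) {
    this.labelNames = labelNames;
  }

  // Builds a stable key from the declared label names, in declaration order.
  private keyFor(labels: Record<string, string>): string {
    return this.labelNames.map((label) => `${label}=${labels[label] ?? ""}`).join(",");
  }

  inc(labels: Record<string, string>): void {
    const key = this.keyFor(labels);
    this.totals[key] = (this.totals[key] ?? 0) + 1;
  }

  get(labels: Record<string, string>): number {
    return this.totals[this.keyFor(labels)] ?? 0;
  }
}

// Example: reachability_state_transitions_total with from_state and to_state.
const stateTransitionsTotal = new LabeledCounter(["from_state", "to_state"]);
stateTransitionsTotal.inc({ from_state: "StaticallyReachable", to_state: "RuntimeObserved" });
```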

Key Log Events

| Event | Level | Fields |
|-------|-------|--------|
| reach.state_transition | INFO | package, from, to |
| reach.drift_detected | WARN | drift_id, impact |
| reach.verdict_changed | WARN | scan_id, previous, new |
| reach.alert_sent | INFO | alert_id, priority |