Runtime Instrumentation Architecture
Technical architecture for the eBPF event adapter bridging Tetragon into Stella Ops.
Overview
The Runtime Instrumentation module is a stream-processing library that connects to the Tetragon eBPF agent via gRPC, receives raw kernel and user-space events, and converts them into the platform's canonical RuntimeCallEvent format. It does not expose HTTP endpoints or maintain a database -- it is consumed as a library by services that need runtime observation data (Signals, Scanner, Policy). The adapter decouples the rest of the platform from Tetragon's wire format and probe semantics.
Design Principles
- Provider abstraction - Downstream modules consume `RuntimeCallEvent`, not Tetragon-specific types; replacing the eBPF agent requires only a new adapter
- Privacy by default - Sensitive data is filtered at the adapter boundary before events propagate into the platform
- Minimal allocation - Event conversion is designed for high-throughput streaming with minimal object allocation
- Deterministic canonicalization - Stack frame normalization produces stable, comparable output regardless of ASLR or load order
Components
RuntimeInstrumentation/
├── StellaOps.RuntimeInstrumentation.Tetragon/  # Core adapter library
│   ├── TetragonEventAdapter.cs  # Raw event -> RuntimeCallEvent conversion
│   ├── Models/
│   │   ├── TetragonEvent.cs  # Raw Tetragon event representation
│   │   ├── RuntimeCallEvent.cs  # Canonical platform event
│   │   ├── CanonicalStackFrame.cs  # Normalized stack frame
│   │   └── ProbeType.cs  # eBPF probe type enumeration
│   ├── StackCanonicalization/
│   │   ├── StackFrameCanonicalizer.cs  # Symbol resolution and normalization
│   │   └── SymbolResolver.cs  # Address-to-symbol mapping
│   ├── Privacy/
│   │   └── PrivacyFilter.cs  # Sensitive data stripping
│   └── HotSymbol/
│       └── HotSymbolPublisher.cs  # Publishes observed symbols to index
│
├── StellaOps.Agent.Tetragon/  # gRPC client for Tetragon agent
│   ├── TetragonGrpcClient.cs  # gRPC stream consumer
│   ├── TetragonStreamReader.cs  # Backpressure-aware stream reader
│   └── Proto/  # Tetragon protobuf definitions
│
└── __Tests/
    └── StellaOps.RuntimeInstrumentation.Tests/  # Unit tests with fixture events
Core Models
RuntimeCallEvent (canonical output)
public sealed record RuntimeCallEvent
{
    public required string EventId { get; init; }
    public required DateTimeOffset Timestamp { get; init; }
    public required ProbeType ProbeType { get; init; }
    public required ProcessInfo Process { get; init; }
    public ThreadInfo? Thread { get; init; }
    public required string Syscall { get; init; }
    public IReadOnlyList<CanonicalStackFrame> StackFrames { get; init; }
    public string? ContainerId { get; init; }
    public string? PodName { get; init; }
    public string? Namespace { get; init; }
}
CanonicalStackFrame
public sealed record CanonicalStackFrame
{
    public required string Module { get; init; }
    public required string Symbol { get; init; }
    public ulong Offset { get; init; }
    public bool IsKernelSpace { get; init; }
    public string? SourceFile { get; init; }
    public int? LineNumber { get; init; }
}
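The deterministic canonicalization principle can be illustrated with a minimal sketch: an absolute address is reduced to a module name plus a module-relative offset, so two processes that load the same library at different ASLR bases produce the same frame. This is not the module's actual `StackFrameCanonicalizer`. `MemoryMapping`, `FrameCanonicalizerSketch`, the `resolveSymbol` delegate, the placeholder strings, and the x86-64 kernel address boundary are all assumptions made for illustration, and the record is a trimmed copy of the one above so the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical description of one mapped region of a process (for example, parsed from /proc/<pid>/maps).
public sealed record MemoryMapping(ulong Start, ulong End, string Path);

// Trimmed copy of the canonical frame shown in this document.
public sealed record CanonicalStackFrame
{
    public required string Module { get; init; }
    public required string Symbol { get; init; }
    public ulong Offset { get; init; }
    public bool IsKernelSpace { get; init; }
}

public static class FrameCanonicalizerSketch
{
    // Assumption: x86-64 Linux, where kernel addresses occupy the upper half of the address space.
    private const ulong KernelSpaceStart = 0xFFFF_8000_0000_0000;

    public static CanonicalStackFrame Canonicalize(
        ulong address,
        IReadOnlyList<MemoryMapping> mappings,
        Func<string, ulong, string?> resolveSymbol)
    {
        if (address >= KernelSpaceStart)
        {
            // Simplification: the kernel offset is left absolute here; a real
            // implementation would also have to account for kernel ASLR.
            return new CanonicalStackFrame
            {
                Module = "[kernel]",
                Symbol = resolveSymbol("[kernel]", address) ?? "[unknown]",
                Offset = address,
                IsKernelSpace = true,
            };
        }

        foreach (var map in mappings)
        {
            if (address >= map.Start && address < map.End)
            {
                // Subtracting the mapping base removes the ASLR-dependent part of the address.
                // Simplification: assumes the mapping starts at file offset 0.
                var offset = address - map.Start;
                // Keep only the file name so host-specific path prefixes do not affect comparisons.
                var module = Path.GetFileName(map.Path);
                return new CanonicalStackFrame
                {
                    Module = module,
                    Symbol = resolveSymbol(module, offset) ?? "[unknown]",
                    Offset = offset,
                };
            }
        }

        // No mapping covers the address; keep the raw value so the frame is not lost.
        return new CanonicalStackFrame { Module = "[unmapped]", Symbol = "[unknown]", Offset = address };
    }
}
```

Because records compare by value, canonicalizing the same function in two processes with different load bases yields frames that are equal, which is what makes the output comparable across runs.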
ProbeType Enumeration
| Probe Type | Description | Origin |
|---|---|---|
| `ProcessExec` | New process execution | Tetragon process tracker |
| `ProcessExit` | Process termination | Tetragon process tracker |
| `Kprobe` | Kernel function entry | Kernel dynamic tracing |
| `Kretprobe` | Kernel function return | Kernel dynamic tracing |
| `Uprobe` | User-space function entry | User-space dynamic tracing |
| `Uretprobe` | User-space function return | User-space dynamic tracing |
| `Tracepoint` | Static kernel tracepoint | Kernel static tracing |
| `USDT` | User-space static tracepoint | Application-defined probes |
| `Fentry` | Kernel function entry (BPF trampoline) | Modern kernel tracing (5.5+) |
| `Fexit` | Kernel function exit (BPF trampoline) | Modern kernel tracing (5.5+) |
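As a rough sketch, the table maps onto a C# enum as follows. The member names come from the table; their order and underlying values, and the two helper methods (`IsUserSpaceProbe`, `IsReturnProbe`), are assumptions for illustration and may not match the real `ProbeType.cs`.

```csharp
using System;

// Member names follow the table above; ordering and numeric values are assumptions.
public enum ProbeType
{
    ProcessExec,
    ProcessExit,
    Kprobe,
    Kretprobe,
    Uprobe,
    Uretprobe,
    Tracepoint,
    USDT,
    Fentry,
    Fexit,
}

// Hypothetical helpers derived only from the table's descriptions.
public static class ProbeTypeExtensions
{
    // True for probes the table describes as attaching to user-space code.
    public static bool IsUserSpaceProbe(this ProbeType probe) => probe switch
    {
        ProbeType.Uprobe or ProbeType.Uretprobe or ProbeType.USDT => true,
        _ => false,
    };

    // True for probes that fire on function return or exit rather than on entry.
    public static bool IsReturnProbe(this ProbeType probe) => probe switch
    {
        ProbeType.Kretprobe or ProbeType.Uretprobe or ProbeType.Fexit => true,
        _ => false,
    };
}
```

A consumer could use such helpers to pair entry and return events, or to skip user-space probes when only kernel activity is of interest.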
Data Flow
[Tetragon Agent]
    │
    │ gRPC stream (protobuf)
    ▼
[TetragonGrpcClient]
    │
    │ TetragonEvent (raw)
    ▼
[TetragonEventAdapter]
    │
    ├── [StackFrameCanonicalizer] ── symbol resolution ──> CanonicalStackFrame[]
    │
    ├── [PrivacyFilter] ── strip sensitive data
    │
    ├── [HotSymbolPublisher] ── publish to hot symbol index
    │
    ▼
[RuntimeCallEvent] (canonical)
    │
    ├──> [Signals] (RTS scoring)
    ├──> [Scanner] (reachability validation)
    └──> [Policy] (runtime evidence)
- Stream connection: `TetragonGrpcClient` establishes a persistent gRPC stream to the Tetragon agent running on the same node.
- Raw event ingestion: `TetragonStreamReader` reads events with backpressure handling; if the consumer falls behind, the oldest events are dropped and a drop metric is incremented.
- Adaptation: `TetragonEventAdapter` maps the raw `TetragonEvent` to a `RuntimeCallEvent`, invoking the stack canonicalizer and privacy filter.
- Stack canonicalization: `StackFrameCanonicalizer` resolves addresses to symbols using the `SymbolResolver`, normalizes module paths, and separates kernel-space from user-space frames.
- Privacy filtering: `PrivacyFilter` removes or redacts environment variables, sensitive command-line arguments, and file paths matching configurable patterns.
- Symbol publishing: `HotSymbolPublisher` emits observed symbols to the hot symbol index, enabling runtime reachability correlation without requiring full re-analysis.
- Downstream consumption: The resulting `RuntimeCallEvent` stream is consumed by Signals (for RTS scoring), Scanner (for reachability validation), and Policy (for runtime evidence in verdicts).
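The drop-oldest behaviour described for raw event ingestion can be sketched with a bounded channel from `System.Threading.Channels`. This is one possible approach, not necessarily how `TetragonStreamReader` is implemented; the `DropOldestBuffer<T>` type and its members are hypothetical names used only here.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;

// Hypothetical buffer placed between the gRPC stream and the adapter.
public sealed class DropOldestBuffer<T>
{
    private readonly Channel<T> _channel;
    private long _dropped;

    public DropOldestBuffer(int capacity)
    {
        var options = new BoundedChannelOptions(capacity)
        {
            // When the buffer is full, evict the oldest queued item instead of blocking the producer.
            FullMode = BoundedChannelFullMode.DropOldest,
            SingleReader = true,
            SingleWriter = true,
        };

        // The callback runs once per evicted item; a real reader would increment its drop metric here.
        _channel = Channel.CreateBounded<T>(options, _ => Interlocked.Increment(ref _dropped));
    }

    // Number of items evicted so far.
    public long DroppedCount => Interlocked.Read(ref _dropped);

    // Consumer side: read with TryRead or ReadAllAsync.
    public ChannelReader<T> Reader => _channel.Reader;

    // Producer side: does not block; returns false only after Complete() has been called.
    public bool Publish(T item) => _channel.Writer.TryWrite(item);

    // Signals that no more items will be published.
    public void Complete() => _channel.Writer.Complete();
}
```

With a capacity of two, publishing three items leaves the two newest in the buffer and reports one drop. Dropping the oldest events favours fresh observations over completeness, which suits a live stream where stale events lose value quickly.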
Security Considerations
- Privacy filtering: All events pass through `PrivacyFilter` before leaving the instrumentation boundary. Configurable patterns control what gets redacted (default: environment variables, home directory paths, credential file paths).
- Kernel vs user-space separation: The `CanonicalStackFrame.IsKernelSpace` flag lets downstream consumers distinguish privilege levels and avoid conflating kernel internals with application code.
- No credential exposure: The gRPC connection to Tetragon uses mTLS when available; connection parameters are configured via environment variables or mounted secrets, not hardcoded.
- Minimal privilege: The adapter library itself requires no elevated privileges; only the Tetragon agent (running as a DaemonSet) requires kernel access.
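Pattern-based redaction of the kind described above might look like the following sketch. It is not the module's `PrivacyFilter`: the class name, the `[REDACTED]` marker, the argument-matching rule, and the example patterns are assumptions chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical filter; the real PrivacyFilter's configuration shape is not shown in this document.
public sealed class PrivacyFilterSketch
{
    private const string Redacted = "[REDACTED]";

    private readonly IReadOnlyList<Regex> _pathPatterns;

    // Assumed rule: flags of the form --password=..., --token=..., --api-key=... carry secrets.
    private static readonly Regex SensitiveArgument = new(
        @"^(--?[\w-]*(password|token|secret|api-?key)[\w-]*=).+$",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    public PrivacyFilterSketch(IEnumerable<string> pathPatterns)
    {
        _pathPatterns = pathPatterns
            .Select(p => new Regex(p, RegexOptions.CultureInvariant))
            .ToList();
    }

    // Replaces the whole path when any configured pattern matches it.
    public string RedactPath(string path) =>
        _pathPatterns.Any(p => p.IsMatch(path)) ? Redacted : path;

    // Keeps the flag name but replaces its value, so the event still shows which option was used.
    public IReadOnlyList<string> RedactArguments(IEnumerable<string> arguments) =>
        arguments.Select(a => SensitiveArgument.Replace(a, "$1" + Redacted)).ToList();
}
```

For example, a filter built with the patterns `^/home/[^/]+/` and `\.aws/credentials$` would redact `/home/alice/.ssh/id_rsa` but leave `/usr/bin/curl` untouched. This sketch only handles `--flag=value` arguments; values passed as a separate argument would need an additional rule.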
Performance Characteristics
- Throughput target: Sustain 50,000 events/second per node without dropping events under normal load
- Latency: Event-to-canonical conversion target under 1 ms per event
- Backpressure: When the consumer cannot keep up, `TetragonStreamReader` applies backpressure via gRPC flow control; persistent overload triggers event dropping, counted by the `runtime_events_dropped_total` metric
- Memory: Pooled buffers for protobuf deserialization to minimize GC pressure
Observability
- Metrics: `runtime_events_received_total{probe_type}`, `runtime_events_converted_total`, `runtime_events_dropped_total`, `runtime_event_conversion_duration_ms`, `hot_symbols_published_total`
- Logs: Structured logs with `eventId`, `probeType`, `containerId`, `processName`
- Health: gRPC connection status and stream lag exposed for monitoring
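A few of the metrics listed above could be declared with the built-in `System.Diagnostics.Metrics` API, as in the sketch below. The document does not say which metrics library the module uses, so the choice of API, the meter name `StellaOps.RuntimeInstrumentation`, and the class and method names are assumptions; only the instrument names and the `probe_type` label come from the list.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Hypothetical wrapper around a subset of the metrics named in this section.
public sealed class RuntimeInstrumentationMetricsSketch : IDisposable
{
    // Assumed meter name; exporters subscribe to instruments by this name.
    public const string MeterName = "StellaOps.RuntimeInstrumentation";

    private readonly Meter _meter = new(MeterName);
    private readonly Counter<long> _received;
    private readonly Counter<long> _dropped;
    private readonly Histogram<double> _conversionMs;

    public RuntimeInstrumentationMetricsSketch()
    {
        _received = _meter.CreateCounter<long>("runtime_events_received_total");
        _dropped = _meter.CreateCounter<long>("runtime_events_dropped_total");
        _conversionMs = _meter.CreateHistogram<double>("runtime_event_conversion_duration_ms", unit: "ms");
    }

    // Counts one received event, tagged with its probe type.
    public void EventReceived(string probeType) =>
        _received.Add(1, new KeyValuePair<string, object?>("probe_type", probeType));

    // Counts one event discarded under overload.
    public void EventDropped() => _dropped.Add(1);

    // Records how long one raw-to-canonical conversion took.
    public void ConversionCompleted(TimeSpan elapsed) =>
        _conversionMs.Record(elapsed.TotalMilliseconds);

    public void Dispose() => _meter.Dispose();
}
```

Recording against these instruments is cheap when nothing is listening, which fits the minimal-allocation goal; an exporter or a `MeterListener` subscribed to the meter name receives the measurements.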
References
- Module README
- Signals Architecture - RTS scoring consumer
- Scanner Architecture - Reachability validation