# eBPF Reachability Architecture

## System Overview

The eBPF reachability system captures kernel-level events and packages them as cryptographically signed evidence of runtime behavior. It uses Linux eBPF (extended Berkeley Packet Filter) with CO-RE (Compile Once, Run Everywhere) for portable deployment across kernel versions.

## Design Principles

1. **Minimal Kernel Footprint**: eBPF programs perform only essential filtering and data capture
2. **User-Space Enrichment**: Complex lookups (symbols, containers, SBOMs) happen in user space
3. **Deterministic Output**: Same inputs produce byte-identical NDJSON output
4. **Chain of Custody**: Every evidence chunk is cryptographically signed and linked

## Component Architecture

### Kernel-Space Components

#### Ring Buffer (`BPF_MAP_TYPE_RINGBUF`)
- Single shared buffer for all event types (default 256 KB)
- Multi-producer, single-consumer design shared across CPUs; the consumer reads without taking locks
- Overload handling via `bpf_ringbuf_reserve()`: when the buffer is full the reservation fails and the probe drops the event instead of blocking the traced task

#### Tracepoint Probes
| Probe | Event Type | Purpose |
|-------|------------|---------|
| `tracepoint/syscalls/sys_enter_openat` | File access | Track which files are opened |
| `tracepoint/sched/sched_process_exec` | Process execution | Track binary invocations |
| `tracepoint/sock/inet_sock_set_state` | TCP state | Track network connections |

#### Uprobe Probes
| Probe | Library | Purpose |
|-------|---------|---------|
| `uprobe/libc.so:connect` | glibc/musl | Outbound network connections |
| `uprobe/libc.so:accept` | glibc/musl | Inbound connections |
| `uprobe/libssl.so:SSL_read` | OpenSSL | TLS traffic monitoring |
| `uprobe/libssl.so:SSL_write` | OpenSSL | TLS traffic monitoring |

#### BPF Maps for Filtering
```c
// Cgroup filter for container targeting
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1024);
    __type(key, u64);    // cgroup_id
    __type(value, u8);   // 1 = include
} cgroup_filter SEC(".maps");

// Namespace filter for multi-tenant isolation
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 256);
    __type(key, u64);    // namespace inode
    __type(value, u8);   // 1 = include
} namespace_filter SEC(".maps");
```

### User-Space Components

#### CoreProbeLoader
Manages the eBPF program lifecycle:
- Loads compiled `.bpf.o` files via libbpf
- Attaches probes to tracepoints and uprobes
- Configures BPF maps for filtering
- Handles graceful detachment and cleanup

#### EventParser
Parses binary events from the ring buffer:
- Fixed-size header with event type discriminator
- Type-specific payload parsing
- Timestamp normalization (boot time to wall clock)
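
The parsing steps above can be sketched in plain C. This is only an illustration under stated assumptions: the document does not give the wire format, so the `event_header` layout, the `event_type` values, and the function names `parse_header` and `boot_to_wall_ns` are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical discriminator values; the real ones are not given here. */
enum event_type { EVT_FILE_OPEN = 1, EVT_PROCESS_EXEC = 2, EVT_TCP_STATE = 3 };

/* Hypothetical fixed-size header (16 bytes, no padding). */
struct event_header {
    uint64_t boot_ns;      /* kernel timestamp, nanoseconds since boot */
    uint32_t pid;
    uint16_t type;         /* selects the type-specific payload parser */
    uint16_t payload_len;  /* bytes following the header */
};

/* Copy the header out of a raw record. Returns 0 on success, -1 if the
 * record is truncated, has an unknown type, or declares a payload that
 * does not fit in the record. */
int parse_header(const uint8_t *buf, size_t len, struct event_header *out)
{
    if (len < sizeof(*out))
        return -1;
    memcpy(out, buf, sizeof(*out));  /* memcpy avoids unaligned reads */
    if (out->type < EVT_FILE_OPEN || out->type > EVT_TCP_STATE)
        return -1;
    if ((size_t)out->payload_len > len - sizeof(*out))
        return -1;
    return 0;
}

/* Convert a since-boot timestamp to wall-clock nanoseconds.
 * boot_offset_ns is the wall-clock time at boot, assumed to be sampled
 * once at startup by the caller. */
uint64_t boot_to_wall_ns(uint64_t boot_ns, uint64_t boot_offset_ns)
{
    return boot_ns + boot_offset_ns;
}
```

Sampling the offset once, rather than per event, keeps the conversion a pure function of the event, which fits the deterministic-output principle.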

#### CgroupContainerResolver
Maps kernel cgroup IDs to container identities:
- Parses `/proc/{pid}/cgroup` for container runtime paths
- Supports containerd, Docker, CRI-O path formats
- Caches mappings with configurable TTL
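
As a rough illustration of the path parsing, the sketch below pulls a container ID out of one `/proc/{pid}/cgroup` line. It assumes that the ID appears as a run of exactly 64 lowercase hex characters, which is the usual shape in Docker, containerd, and CRI-O cgroup paths; the real resolver may match runtime-specific prefixes instead. The name `extract_container_id` is hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define CONTAINER_ID_LEN 64

static int is_lower_hex(char c)
{
    return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f');
}

/* Scan one cgroup line for a run of exactly 64 lowercase hex characters
 * and copy it into out (CONTAINER_ID_LEN + 1 bytes). If several runs
 * match, the last one wins. Returns 1 if an ID was found, 0 otherwise. */
int extract_container_id(const char *line, char out[CONTAINER_ID_LEN + 1])
{
    size_t run = 0;
    int found = 0;

    for (size_t i = 0; ; i++) {
        if (line[i] != '\0' && is_lower_hex(line[i])) {
            run++;
            continue;
        }
        /* A non-hex character (or the terminator) ends the current run. */
        if (run == CONTAINER_ID_LEN) {
            memcpy(out, line + i - run, run);
            out[run] = '\0';
            found = 1;
        }
        run = 0;
        if (line[i] == '\0')
            break;
    }
    return found;
}
```

Matching on the ID shape rather than on each runtime's prefix keeps the sketch short, at the cost of accepting any 64-character hex path segment.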

#### EnhancedSymbolResolver
Resolves addresses to human-readable symbols:
- Parses `/proc/{pid}/maps` for ASLR offsets
- Reads ELF symbol tables (`.symtab`, `.dynsym`)
- Optional DWARF debug info for line numbers
- LRU cache with bounded memory usage
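
The first resolution step, undoing ASLR using `/proc/{pid}/maps`, can be sketched as follows. The `mapping` struct and both function names are hypothetical, and the parser is simplified: it handles only file-backed mappings and stops at the first space in the path.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

struct mapping {
    uint64_t start, end;    /* runtime virtual address range [start, end) */
    uint64_t file_offset;   /* offset of this mapping within the file */
    char path[256];
};

/* Parse one /proc/{pid}/maps line of the form
 *   start-end perms offset dev inode pathname
 * Returns 1 for a file-backed mapping, 0 if the line has no pathname
 * or does not parse. */
int parse_maps_line(const char *line, struct mapping *m)
{
    char perms[5];
    int n = sscanf(line,
                   "%" SCNx64 "-%" SCNx64 " %4s %" SCNx64 " %*s %*s %255s",
                   &m->start, &m->end, perms, &m->file_offset, m->path);
    return n == 5;
}

/* Translate a runtime address into an offset within the mapped file.
 * Returns 1 and stores the offset if addr lies inside the mapping,
 * 0 otherwise. */
int addr_to_file_offset(const struct mapping *m, uint64_t addr, uint64_t *off)
{
    if (addr < m->start || addr >= m->end)
        return 0;
    *off = addr - m->start + m->file_offset;
    return 1;
}
```

The result is a file offset, not yet a symbol. Comparing it against `.symtab` or `.dynsym` entries additionally requires mapping file offsets to ELF virtual addresses through the program headers, which is omitted here.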

#### RuntimeEventEnricher
Decorates events with container and SBOM metadata:
- Container ID and image digest correlation
- SBOM component (PURL) lookup
- Graceful degradation on missing metadata

#### RuntimeEvidenceNdjsonWriter
Produces deterministic NDJSON output:
- Canonical JSON serialization (sorted keys, no whitespace variance)
- Rolling BLAKE3 hash for content verification
- Size- and time-based rotation with callbacks
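
To make the canonical-serialization point concrete, here is a minimal sketch of writing one NDJSON line with sorted keys and no optional whitespace. It assumes keys need no escaping and values arrive already encoded as JSON; the `field` struct and `write_canonical_line` are hypothetical names, and the BLAKE3 hashing and rotation steps are left out.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct field {
    const char *key;   /* assumed to need no JSON escaping */
    const char *json;  /* value already encoded, e.g. "42" or "\"openat\"" */
};

static int by_key(const void *a, const void *b)
{
    return strcmp(((const struct field *)a)->key,
                  ((const struct field *)b)->key);
}

/* Serialize fields as one NDJSON line with keys in byte order and no
 * optional whitespace. Sorts the array in place. Returns the line
 * length (excluding the terminator), or -1 if out is too small. */
int write_canonical_line(struct field *fields, size_t n, char *out, size_t cap)
{
    size_t pos = 0;

    qsort(fields, n, sizeof(*fields), by_key);
    for (size_t i = 0; i < n; i++) {
        int w = snprintf(out + pos, cap - pos, "%s\"%s\":%s",
                         i == 0 ? "{" : ",", fields[i].key, fields[i].json);
        if (w < 0 || (size_t)w >= cap - pos)
            return -1;
        pos += (size_t)w;
    }
    int w = snprintf(out + pos, cap - pos, "%s}\n", n == 0 ? "{" : "");
    if (w < 0 || (size_t)w >= cap - pos)
        return -1;
    return (int)(pos + (size_t)w);
}
```

Because the key order is fixed by the sort rather than by insertion order, two events with the same fields serialize to the same bytes, which is what lets a rolling hash over the file be reproduced by a verifier.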

#### EvidenceChunkFinalizer
Signs and links evidence chunks:
- Creates in-toto statements with chunk metadata
- Requests DSSE signatures via Signer service
- Submits to Rekor transparency log
- Maintains chain state (previous_chunk_id linkage)

## Data Flow

```
1. Kernel Event
   │
   ├─► Tracepoint/Uprobe fires
   │   └─► BPF program captures event data
   │       └─► Filter by cgroup/namespace (optional)
   │           └─► Submit to ring buffer
   │
2. Ring Buffer Drain
   │
   ├─► EventParser reads binary data
   │   └─► Deserialize to typed event struct
   │       └─► Validate event integrity
   │
3. Resolution & Enrichment
   │
   ├─► CgroupResolver: cgroup_id → container_id
   ├─► SymbolResolver: address → symbol name
   ├─► StateProvider: container_id → image_ref
   ├─► DigestResolver: image_ref → image_digest
   └─► SbomProvider: image_digest → purls[]
   │
4. Serialization
   │
   ├─► RuntimeEvidenceNdjsonWriter
   │   ├─► Canonical JSON serialization
   │   ├─► Append to current chunk file
   │   └─► Update rolling hash
   │
5. Rotation & Signing
   │
   ├─► Size/time threshold reached
   │   └─► Close current chunk
   │       └─► ChunkFinalizer
   │           ├─► Create in-toto statement
   │           ├─► Sign with DSSE
   │           ├─► Submit to Rekor
   │           └─► Link to previous chunk
   │
6. Verification
   │
   └─► stella signals verify-chain
       ├─► Parse DSSE envelopes
       ├─► Verify signatures
       ├─► Check chain linkage
       └─► Validate time monotonicity
```
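
The linkage and time-monotonicity checks in the verification step can be illustrated with a short sketch. Signature verification is omitted. The `chunk` struct, its timestamp fields, and `verify_chain` are hypothetical; only the `previous_chunk_id` linkage comes from this document.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view of one finalized chunk. */
struct chunk {
    const char *chunk_id;
    const char *previous_chunk_id;  /* NULL for the first chunk */
    uint64_t start_ns, end_ns;      /* time range covered by the chunk */
};

/* Walk the chunks in order. Returns -1 if every chunk links to its
 * predecessor and time never moves backwards, otherwise the index of
 * the first chunk that breaks the chain. */
int verify_chain(const struct chunk *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (c[i].end_ns < c[i].start_ns)
            return (int)i;
        if (i == 0)
            continue;  /* the first chunk has no predecessor to check */
        if (c[i].previous_chunk_id == NULL ||
            strcmp(c[i].previous_chunk_id, c[i - 1].chunk_id) != 0)
            return (int)i;
        if (c[i].start_ns < c[i - 1].end_ns)
            return (int)i;
    }
    return -1;
}
```

Returning the index of the offending chunk, rather than a plain pass/fail, tells an operator where a gap or reordering was introduced.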
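
## Performance Characteristics

### Kernel-Space
- The ring buffer absorbs bursts; when it fills, `bpf_ringbuf_reserve()` fails and the event is dropped rather than stalling the traced workload (see Ring Buffer Overflow under Failure Modes)
- In-kernel filtering reduces user-space processing
- BTF-based CO-RE relocations resolve struct field offsets at load time, so probes read kernel fields without per-kernel offset tables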

### User-Space
| Operation | Target Latency |
|-----------|---------------|
| Cached symbol lookup | < 1ms p99 |
| Uncached symbol lookup | < 10ms p99 |
| Container enrichment | < 10ms p99 |
| NDJSON write | < 1ms p99 |

### Throughput
- Target: 100,000 events/second sustained
- Rate limiting available for resource-constrained environments
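
The document does not say how rate limiting is implemented. One common approach is a token bucket, sketched below purely as an illustration; the `token_bucket` struct and `bucket_allow` are hypothetical and not taken from the system. Timestamps are passed in explicitly and must be monotonic.

```c
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

struct token_bucket {
    uint64_t rate_per_sec;  /* sustained events per second */
    uint64_t burst;         /* maximum tokens; assumed <= rate_per_sec */
    uint64_t tokens;        /* tokens currently available */
    uint64_t last_ns;       /* time up to which tokens were credited */
};

/* Returns 1 if an event arriving at now_ns may pass, 0 if it should
 * be dropped. */
int bucket_allow(struct token_bucket *b, uint64_t now_ns)
{
    uint64_t elapsed = now_ns - b->last_ns;

    /* After a long idle period the bucket is full anyway (burst is
     * assumed <= rate_per_sec), so cap elapsed to avoid overflow. */
    if (elapsed > NS_PER_SEC) {
        elapsed = NS_PER_SEC;
        b->last_ns = now_ns - NS_PER_SEC;
    }

    uint64_t refill = elapsed * b->rate_per_sec / NS_PER_SEC;
    if (refill > 0) {
        uint64_t total = b->tokens + refill;
        b->tokens = total > b->burst ? b->burst : total;
        /* Advance only by the time that produced whole tokens, so
         * fractional progress is not lost. */
        b->last_ns += refill * NS_PER_SEC / b->rate_per_sec;
    }

    if (b->tokens == 0)
        return 0;
    b->tokens--;
    return 1;
}
```

With `rate_per_sec = 100000`, one token is credited every 10 microseconds, matching the sustained throughput target above.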

## Memory Budget

| Component | Default | Configurable |
|-----------|---------|--------------|
| Ring buffer | 256 KB | Yes |
| Symbol cache | 100,000 entries | Yes |
| Container cache | 5 min TTL | Yes |
| Write buffer | 64 KB | Yes |

## Failure Modes

### Ring Buffer Overflow
- **Symptom**: Events dropped, warning logged
- **Mitigation**: Increase buffer size or enable rate limiting
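
When choosing a larger buffer, it helps to estimate how long the buffer can absorb events while the consumer is stalled. The sketch below does that arithmetic and also rounds a requested size to a power of two that is a multiple of the page size, which the kernel requires for `BPF_MAP_TYPE_RINGBUF`. The function names and the 128-byte record size in the usage note are assumptions for illustration only.

```c
#include <stdint.h>

/* Round a requested size up to the next power of two that is at least
 * one page. page_size is assumed to be a power of two (e.g. 4096). */
uint64_t ringbuf_size(uint64_t requested, uint64_t page_size)
{
    uint64_t size = page_size;
    while (size < requested && size < (1ULL << 62))
        size <<= 1;
    return size;
}

/* Microseconds of events the buffer can hold if the consumer makes no
 * progress. Ignores the per-record header the kernel adds, so the
 * result is an upper bound. */
uint64_t headroom_us(uint64_t buf_bytes, uint64_t record_bytes,
                     uint64_t events_per_sec)
{
    uint64_t capacity = buf_bytes / record_bytes;
    return capacity * 1000000ULL / events_per_sec;
}
```

For example, with the default 256 KB buffer, an assumed 128-byte record, and the 100,000 events/second target, `headroom_us(262144, 128, 100000)` gives 20480, so roughly 20 ms of stall can be absorbed before events are dropped.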

### Symbol Resolution Failure
- **Symptom**: Address shown as `addr:0x{hex}`
- **Mitigation**: Ensure debug symbols are available or accept address-only evidence

### Container Resolution Failure
- **Symptom**: `container_id = "unknown:{cgroup_id}"`
- **Mitigation**: Verify Zastava integration, check cgroup path format support
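
Because output must be byte-identical for identical inputs, the fallback strings for unresolved symbols and containers need a fixed format. A minimal sketch follows, assuming lowercase hex without padding for addresses and decimal for cgroup IDs; the document shows only the `addr:0x{hex}` and `unknown:{cgroup_id}` shapes, so those details and both function names are assumptions.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Fallback used when no symbol covers the address: "addr:0x{hex}".
 * Returns the snprintf result (length that would have been written). */
int format_symbol_fallback(char *out, size_t cap, uint64_t addr)
{
    return snprintf(out, cap, "addr:0x%" PRIx64, addr);
}

/* Fallback used when the cgroup cannot be mapped to a container:
 * "unknown:{cgroup_id}". Returns the snprintf result. */
int format_container_fallback(char *out, size_t cap, uint64_t cgroup_id)
{
    return snprintf(out, cap, "unknown:%" PRIu64, cgroup_id);
}
```

Keeping the raw address or cgroup ID in the fallback means the evidence can still be resolved later, once symbols or container metadata become available.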

### Signing Failure
- **Symptom**: Chunk saved without signature, warning logged
- **Mitigation**: Check Signer service availability, verify Fulcio/KMS connectivity

## CO-RE (Compile Once, Run Everywhere)

The system uses BTF (BPF Type Format) for kernel-version-independent field access:

```c
// Access kernel struct fields without hardcoded offsets
struct task_struct *task = (void *)bpf_get_current_task();
pid_t pid = BPF_CORE_READ(task, pid);
pid_t tgid = BPF_CORE_READ(task, tgid);
```
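
**Requirements:**
- Kernel 5.2+ with built-in BTF (recommended)
- Kernel 4.14+ with external BTF from btfhub
- Note that `BPF_MAP_TYPE_RINGBUF` itself was introduced in upstream kernel 5.8, so the ring-buffer transport needs 5.8+ (or a vendor backport) regardless of where BTF comes from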
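
A loader can pick between the two BTF sources by checking whether the kernel exposes its own type information. Kernels built with `CONFIG_DEBUG_INFO_BTF` publish it at `/sys/kernel/btf/vmlinux`. The sketch below is a hedged illustration of that check; `choose_btf_source` and the `btf_source` enum are hypothetical names, and the path is a parameter so the logic can be exercised without a BTF-enabled kernel.

```c
#include <unistd.h>

enum btf_source { BTF_BUILTIN, BTF_EXTERNAL };

/* Where BTF-enabled kernels expose their type information. */
#define BUILTIN_BTF_PATH "/sys/kernel/btf/vmlinux"

/* Prefer the kernel's own BTF when it is readable; otherwise fall back
 * to an external BTF file (for example one obtained from btfhub). */
enum btf_source choose_btf_source(const char *builtin_path)
{
    return access(builtin_path, R_OK) == 0 ? BTF_BUILTIN : BTF_EXTERNAL;
}
```

In the external case, libbpf accepts the path of the BTF file through the `btf_custom_path` field of `struct bpf_object_open_opts`; how this system wires that up is not described here.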

## Integration Points

### Zastava (Container State)
- `IContainerIdentityResolver` interface
- Container lifecycle events (start/stop)
- Image reference to digest mapping

### Scanner (Reachability Merger)
- `EbpfSignalMerger` combines runtime with static analysis
- Symbol hash correlation via `RuntimeNodeHash`

### Signer (Evidence Signing)
- `IAttestationSigningService` for DSSE signatures
- `IRekorClient` for transparency log submission

### SBOM Service (Component Correlation)
- `ISbomComponentProvider` for PURL lookup
- Image digest to component mapping