Runtime Linkage Verification - Operational Runbook
Audience: Platform operators, SREs, security engineers
Related: Runtime Linkage Guide, Function Map V1 Contract
Overview
This runbook covers production deployment and operation of the runtime linkage verification system. The system uses eBPF probes to observe function calls and verifies them against declared function maps.
Prerequisites
- Linux kernel 5.8+ (for eBPF CO-RE support)
- CAP_BPF and CAP_PERFMON capabilities for the runtime agent
- BTF (BPF Type Format) enabled in kernel config
- Stella runtime agent deployed as a DaemonSet or sidecar
Deployment
Runtime Agent Configuration
The Stella runtime agent (stella-runtime-agent) attaches eBPF probes based on function map predicates. Configure it through environment variables or YAML:
runtime_agent:
  observation_store:
    type: "memory"  # or "postgres", "valkey"
    retention_hours: 72
    max_batch_size: 1000
  probes:
    max_concurrent: 256
    attach_timeout_ms: 5000
    default_types: ["uprobe", "kprobe"]
  export:
    format: "ndjson"
    flush_interval_ms: 5000
    output_path: "/var/stella/observations/"
Probe Selection Guidance
| Category | Probe Type | Use Case |
|---|---|---|
| Crypto functions | uprobe | OpenSSL/BoringSSL/libsodium calls |
| Network I/O | kprobe | connect/sendto/recvfrom syscalls |
| Auth flows | uprobe | PAM/LDAP/OAuth library calls |
| File access | kprobe | open/read/write on sensitive paths |
| TLS handshake | uprobe | SSL_do_handshake, TLS negotiation |
Prioritization:
- Start with crypto and auth paths (highest security relevance)
- Add network I/O for service mesh verification
- Expand to file access for compliance requirements
Resource Overhead
Expected overhead per probe:
- CPU: ~0.1-0.5% per active uprobe (per-call overhead ~100ns)
- Memory: ~2KB per attached probe + observation buffer
- Disk: ~100 bytes per observation record (NDJSON)
Recommended limits:
- Max 256 concurrent probes per node
- Observation buffer: 64MB
- Flush interval: 5 seconds
- Retention: 72 hours (configurable)
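As a rough worked example of the figures above, the sketch below turns the per-probe and per-record estimates into node-level totals. The function name and the example rate of 100 observations per second are illustrative assumptions, not measured values.

```python
def estimate_overhead(probes, observations_per_sec, retention_hours=72,
                      bytes_per_probe=2 * 1024, bytes_per_record=100):
    """Rough sizing from the runbook's per-probe and per-record estimates."""
    # Memory held by attached probes only; the observation buffer is extra.
    probe_memory_bytes = probes * bytes_per_probe
    # Uncompressed NDJSON written over the whole retention window.
    disk_bytes = observations_per_sec * 3600 * retention_hours * bytes_per_record
    return {"probe_memory_bytes": probe_memory_bytes, "disk_bytes": disk_bytes}


if __name__ == "__main__":
    est = estimate_overhead(probes=256, observations_per_sec=100)
    print(f"probe memory: {est['probe_memory_bytes'] / 1024:.0f} KiB")
    print(f"disk over retention: {est['disk_bytes'] / 1e9:.2f} GB")
```

With 256 probes and 100 observations per second, this gives about 512 KiB of probe memory (excluding the observation buffer) and roughly 2.6 GB of uncompressed NDJSON over 72 hours.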
Operations
Generating Function Maps
Run generation in the CI/CD pipeline, after SBOM generation:
# In CI after SBOM generation
stella function-map generate \
  --sbom "${BUILD_DIR}/sbom.cdx.json" \
  --service "${SERVICE_NAME}" \
  --hot-functions "crypto/*" --hot-functions "net/*" --hot-functions "auth/*" \
  --min-rate 0.95 \
  --window 1800 \
  --build-id "${CI_BUILD_ID}" \
  --output "${BUILD_DIR}/function-map.json"
Store the function map alongside the container image (OCI referrer or artifact registry).
Continuous Verification
Set up periodic verification (cron or controller loop):
# Every 30 minutes, verify the last hour of observations
stella function-map verify \
  --function-map /etc/stella/function-map.json \
  --from "$(date -d '1 hour ago' -Iseconds)" \
  --to "$(date -Iseconds)" \
  --format json --output /var/stella/verification/latest.json
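Relative dates such as `date -d '1 hour ago'` are a GNU extension, so a controller loop may prefer to compute the window itself. The sketch below builds the same command line in Python. `build_verify_command` is a hypothetical helper, only flags that appear in the command above are used, and it assumes the CLI accepts the same ISO 8601 timestamps that `date -Iseconds` produces.

```python
from datetime import datetime, timedelta, timezone


def build_verify_command(function_map, output, window_hours=1, now=None):
    """Return argv for `stella function-map verify` over the trailing window."""
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(hours=window_hours)
    return [
        "stella", "function-map", "verify",
        "--function-map", function_map,
        "--from", start.isoformat(timespec="seconds"),
        "--to", now.isoformat(timespec="seconds"),
        "--format", "json",
        "--output", output,
    ]


if __name__ == "__main__":
    argv = build_verify_command(
        "/etc/stella/function-map.json",
        "/var/stella/verification/latest.json",
    )
    # Hand argv to subprocess.run(argv, check=True) to actually execute it.
    print(" ".join(argv))
```

Returning an argument list rather than a shell string avoids quoting problems when paths contain spaces.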
Monitoring
Key metrics to alert on:
| Metric | Threshold | Action |
|---|---|---|
| observation_rate | < 0.80 | Warning: coverage dropping |
| observation_rate | < 0.50 | Critical: significant coverage loss |
| unexpected_symbols_count | > 0 | Investigate: undeclared functions executing |
| probe_attach_failures | > 5% | Warning: probe attachment issues |
| observation_buffer_full | true | Critical: observations being dropped |
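The thresholds in the table can be expressed as a small evaluation function, for example inside a custom exporter or a smoke test. This is a minimal sketch: the metric names come from the table, while the shape of the `metrics` dict and the use of `probe_attach_failure_rate` as a fraction are assumptions about how the values are exposed.

```python
def evaluate_metrics(metrics):
    """Map a metrics snapshot to (severity, message) pairs per the table above."""
    findings = []
    rate = metrics.get("observation_rate")
    if rate is not None:
        # Check the stricter threshold first so only one coverage finding fires.
        if rate < 0.50:
            findings.append(("critical", "significant coverage loss"))
        elif rate < 0.80:
            findings.append(("warning", "coverage dropping"))
    if metrics.get("unexpected_symbols_count", 0) > 0:
        findings.append(("investigate", "undeclared functions executing"))
    if metrics.get("probe_attach_failure_rate", 0.0) > 0.05:
        findings.append(("warning", "probe attachment issues"))
    if metrics.get("observation_buffer_full", False):
        findings.append(("critical", "observations being dropped"))
    return findings
```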
Alert Configuration
alerts:
  - name: "function-map-coverage-low"
    condition: observation_rate < 0.80
    severity: warning
    description: "Function map coverage below 80% for {service}"
    runbook: "Check probe attachment, verify no binary update without map regeneration"
  - name: "function-map-unexpected-calls"
    condition: unexpected_symbols_count > 0
    severity: info
    description: "Unexpected function calls detected in {service}"
    runbook: "Review unexpected symbols, regenerate function map if benign"
  - name: "function-map-probe-failures"
    condition: probe_attach_failure_rate > 0.05
    severity: warning
    description: "Probe attachment failure rate above 5%"
    runbook: "Check kernel version, verify BTF availability, check CAP_BPF"
Performance Tuning
High-Traffic Services
For services with >10K calls/second on probed functions:
- Sampling: Configure observation sampling rate:
    probes:
      sampling_rate: 0.01  # 1% of calls
- Aggregation: Use count-based observations instead of per-call:
    export:
      aggregation_window_ms: 1000  # Aggregate per second
- Selective probing: Use --hot-functions to limit to critical paths only
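To illustrate the aggregation option, the sketch below collapses per-call observations into one count per symbol per window. The record fields (`ts_ms`, `symbol`) are hypothetical and do not reflect the agent's actual NDJSON schema.

```python
from collections import Counter


def aggregate(observations, window_ms=1000):
    """Count calls per (window start, symbol) instead of keeping every call."""
    counts = Counter()
    for obs in observations:
        # Round the timestamp down to the start of its window.
        window_start = obs["ts_ms"] - obs["ts_ms"] % window_ms
        counts[(window_start, obs["symbol"])] += 1
    return [
        {"window_start_ms": start, "symbol": symbol, "count": n}
        for (start, symbol), n in sorted(counts.items())
    ]
```

The output size is bounded by the number of distinct symbols per window rather than by the call rate, which is why aggregation helps most on hot paths.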
Large Function Maps
For maps with >100 expected paths:
- Tag paths by priority: crypto > auth > network > general
- Mark low-priority paths as optional: true
- Set per-tag minimum rates if needed
Storage Optimization
For long-term observation storage:
- Enable retention pruning: pruneOlderThanAsync(72h)
- Compress archived observations (gzip NDJSON)
- Use dedicated Postgres partitions by date for query performance
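Compression of archived observations could be handled by a small housekeeping job such as the sketch below. It assumes exported files end in `.ndjson` and sit in a single directory (for example the `output_path` from the agent configuration); the 24-hour cutoff and the function name are illustrative.

```python
import gzip
import shutil
import time
from pathlib import Path


def compress_old_ndjson(directory, older_than_hours=24, now=None):
    """Gzip .ndjson files whose mtime is older than the cutoff; return new paths."""
    now = time.time() if now is None else now
    cutoff = now - older_than_hours * 3600
    compressed = []
    for path in sorted(Path(directory).glob("*.ndjson")):
        if path.stat().st_mtime >= cutoff:
            continue  # still recent; the agent may be appending to it
        target = path.with_name(path.name + ".gz")
        with path.open("rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        path.unlink()  # remove the original only after the archive is written
        compressed.append(target)
    return compressed
```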
Incident Response
Coverage Dropped After Deployment
- Check if binary was updated without regenerating the function map
- Verify probes are still attached: stella observations query --summary
- Check for symbol changes (ASLR, different build)
- Regenerate function map from new SBOM and redeploy
Unexpected Symbols Detected
- Identify the unexpected functions from the verification report
- Determine if they are:
  - Benign: Dynamic dispatch, plugins, lazy-loaded libraries → add to map
  - Suspicious: Unexpected crypto usage, network calls → escalate to security team
- If benign, regenerate function map with broader patterns
- If suspicious, correlate with vulnerability findings and open incident
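A first-pass triage of the verification report could be scripted along these lines. The glob patterns reuse the `crypto/*`, `net/*`, and `auth/*` categories from the generation step; the report shape (a plain list of symbol names) and the function name are assumptions, and the output is a prompt for human review rather than a verdict.

```python
from fnmatch import fnmatchcase

# Assumed naming scheme, mirroring the --hot-functions patterns.
SENSITIVE_PATTERNS = ("crypto/*", "net/*", "auth/*")


def triage(unexpected_symbols, patterns=SENSITIVE_PATTERNS):
    """Split unexpected symbols into 'escalate' and 'review' buckets."""
    buckets = {"escalate": [], "review": []}
    for symbol in unexpected_symbols:
        sensitive = any(fnmatchcase(symbol, p) for p in patterns)
        buckets["escalate" if sensitive else "review"].append(symbol)
    return buckets
```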
Probe Attachment Failures
- Check kernel version: uname -r (need 5.8+)
- Verify BTF: ls /sys/kernel/btf/vmlinux
- Check capabilities: capsh --print | grep bpf
- Check binary paths: verify binary_path in function map matches deployed binary
- Check for SELinux/AppArmor blocking BPF operations
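The first two checks can be combined into a small preflight script. This is a sketch, not part of the Stella tooling: it parses the kernel release string and tests for the BTF file, and leaves the capability and SELinux/AppArmor checks to the commands above.

```python
import os
import platform
import re


def kernel_at_least(release, minimum=(5, 8)):
    """True if a release string such as '5.15.0-91-generic' meets the minimum."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        return False
    return (int(match.group(1)), int(match.group(2))) >= minimum


def preflight(btf_path="/sys/kernel/btf/vmlinux"):
    """Return a dict of check name -> bool for the current host."""
    return {
        "kernel_5_8_plus": kernel_at_least(platform.release()),
        "btf_available": os.path.exists(btf_path),
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{name}: {'ok' if ok else 'FAIL'}")
```

Comparing the version as an integer tuple avoids the string-comparison trap where "5.10" sorts before "5.8".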
Air-Gap Considerations
For air-gapped environments:
- Bundle generation (connected side):
    stella function-map generate --sbom app.cdx.json --service my-service --output fm.json
    # Package with observations
    tar czf linkage-bundle.tgz fm.json observations/*.ndjson
- Transfer via approved media to air-gapped environment
- Offline verification (air-gapped side):
    stella function-map verify --function-map fm.json --offline --observations obs.ndjson
- Result export for compliance reporting:
    stella function-map verify ... --format json --output report.json
    # Sign the report
    stella attest sign --input report.json --output report.dsse.json