# Runtime Linkage Verification - Operational Runbook

> **Audience:** Platform operators, SREs, security engineers
> **Related:** [Runtime Linkage Guide](../modules/scanner/guides/runtime-linkage.md), [Function Map V1 Contract](../contracts/function-map-v1.md)

## Overview

This runbook covers production deployment and operation of the runtime linkage verification system. The system uses eBPF probes to observe function calls and verifies them against declared function maps.

---

## Prerequisites

- Linux kernel 5.8+ (for eBPF CO-RE support)
- `CAP_BPF` and `CAP_PERFMON` capabilities for the runtime agent
- BTF (BPF Type Format) enabled in kernel config
- Stella runtime agent deployed as a DaemonSet or sidecar

---

## Deployment

### Runtime Agent Configuration

The Stella runtime agent (`stella-runtime-agent`) attaches eBPF probes based on function map predicates. Configure it through environment variables or YAML:

```yaml
runtime_agent:
  observation_store:
    type: "memory"  # or "postgres", "valkey"
    retention_hours: 72
    max_batch_size: 1000

  probes:
    max_concurrent: 256
    attach_timeout_ms: 5000
    default_types: ["uprobe", "kprobe"]

  export:
    format: "ndjson"
    flush_interval_ms: 5000
    output_path: "/var/stella/observations/"
```

### Probe Selection Guidance

| Category | Probe Type | Use Case |
|----------|-----------|----------|
| Crypto functions | `uprobe` | OpenSSL/BoringSSL/libsodium calls |
| Network I/O | `kprobe` | connect/sendto/recvfrom syscalls |
| Auth flows | `uprobe` | PAM/LDAP/OAuth library calls |
| File access | `kprobe` | open/read/write on sensitive paths |
| TLS handshake | `uprobe` | SSL_do_handshake, TLS negotiation |

**Prioritization:**

1. Start with crypto and auth paths (highest security relevance)
2. Add network I/O for service mesh verification
3. Expand to file access for compliance requirements

### Resource Overhead

Expected overhead per probe:

- CPU: ~0.1-0.5% per active uprobe (per-call overhead ~100ns)
- Memory: ~2KB per attached probe + observation buffer
- Disk: ~100 bytes per observation record (NDJSON)

**Recommended limits:**

- Max 256 concurrent probes per node
- Observation buffer: 64MB
- Flush interval: 5 seconds
- Retention: 72 hours (configurable)

---

## Operations

### Generating Function Maps

Run generation in the CI/CD pipeline, after the SBOM has been produced:

```bash
# In CI after SBOM generation
stella function-map generate \
  --sbom "${BUILD_DIR}/sbom.cdx.json" \
  --service "${SERVICE_NAME}" \
  --hot-functions "crypto/*" --hot-functions "net/*" --hot-functions "auth/*" \
  --min-rate 0.95 \
  --window 1800 \
  --build-id "${CI_BUILD_ID}" \
  --output "${BUILD_DIR}/function-map.json"
```

Store the function map alongside the container image (OCI referrer or artifact registry).

### Continuous Verification

Set up periodic verification (cron or controller loop):

```bash
# Every 30 minutes, verify the last hour of observations
stella function-map verify \
  --function-map /etc/stella/function-map.json \
  --from "$(date -d '1 hour ago' -Iseconds)" \
  --to "$(date -Iseconds)" \
  --format json --output /var/stella/verification/latest.json
```

### Monitoring

Key metrics to alert on:

| Metric | Threshold | Action |
|--------|-----------|--------|
| `observation_rate` | < 0.80 | Warning: coverage dropping |
| `observation_rate` | < 0.50 | Critical: significant coverage loss |
| `unexpected_symbols_count` | > 0 | Investigate: undeclared functions executing |
| `probe_attach_failures` | > 5% | Warning: probe attachment issues |
| `observation_buffer_full` | true | Critical: observations being dropped |

### Alert Configuration

```yaml
alerts:
  - name: "function-map-coverage-low"
    condition: observation_rate < 0.80
    severity: warning
    description: "Function map coverage below 80% for {service}"
    runbook: "Check probe attachment, verify no binary update without map regeneration"

  - name: "function-map-unexpected-calls"
    condition: unexpected_symbols_count > 0
    severity: info
    description: "Unexpected function calls detected in {service}"
    runbook: "Review unexpected symbols, regenerate function map if benign"

  - name: "function-map-probe-failures"
    condition: probe_attach_failure_rate > 0.05
    severity: warning
    description: "Probe attachment failure rate above 5%"
    runbook: "Check kernel version, verify BTF availability, check CAP_BPF"
```

---

## Performance Tuning

### High-Traffic Services

For services with >10K calls/second on probed functions:

1. **Sampling:** Configure observation sampling rate:
   ```yaml
   probes:
     sampling_rate: 0.01  # 1% of calls
   ```

2. **Aggregation:** Use count-based observations instead of per-call:
   ```yaml
   export:
     aggregation_window_ms: 1000  # Aggregate per second
   ```

3. **Selective probing:** Use `--hot-functions` to limit probing to critical paths only

### Large Function Maps

For maps with >100 expected paths:

1. Tag paths by priority: `crypto` > `auth` > `network` > `general`
2. Mark low-priority paths as `optional: true`
3. Set per-tag minimum rates if needed

### Storage Optimization

For long-term observation storage:

1. Enable retention pruning: `pruneOlderThanAsync(72h)`
2. Compress archived observations (gzip NDJSON)
3. Use dedicated Postgres partitions by date for query performance

---

## Incident Response

### Coverage Dropped After Deployment

1. Check if the binary was updated without regenerating the function map
2. Verify probes are still attached: `stella observations query --summary`
3. Check for symbol changes (ASLR, different build)
4. Regenerate the function map from the new SBOM and redeploy

### Unexpected Symbols Detected

1. Identify the unexpected functions from the verification report
2. Determine if they are:
   - **Benign:** Dynamic dispatch, plugins, lazy-loaded libraries → add to map
   - **Suspicious:** Unexpected crypto usage, network calls → escalate to security team
3. If benign, regenerate the function map with broader patterns
4. If suspicious, correlate with vulnerability findings and open an incident

### Probe Attachment Failures

1. Check kernel version: `uname -r` (5.8+ required)
2. Verify BTF: `ls /sys/kernel/btf/vmlinux`
3. Check capabilities: `capsh --print | grep bpf`
4. Check binary paths: verify `binary_path` in the function map matches the deployed binary
5. Check for SELinux/AppArmor blocking BPF operations

---

## Air-Gap Considerations

For air-gapped environments:

1. **Bundle generation** (connected side):
   ```bash
   stella function-map generate --sbom app.cdx.json --service my-service --output fm.json
   # Package with observations
   tar czf linkage-bundle.tgz fm.json observations/*.ndjson
   ```

2. **Transfer** via approved media to the air-gapped environment

3. **Offline verification** (air-gapped side):
   ```bash
   stella function-map verify --function-map fm.json --offline --observations obs.ndjson
   ```

4. **Result export** for compliance reporting:
   ```bash
   stella function-map verify ... --format json --output report.json
   # Sign the report
   stella attest sign --input report.json --output report.dsse.json
   ```
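
The transfer step in the air-gap list does not say how to confirm that the bundle arrived intact. One option is to record a checksum on the connected side and check it before offline verification. The following is a minimal sketch, not part of the Stella CLI: it assumes GNU coreutils `sha256sum` is available on both sides, and the helper names `write_checksum` and `verify_checksum` are hypothetical.

```bash
#!/bin/sh
set -eu

# Hypothetical helper (connected side): write <bundle>.sha256 next to the bundle.
# The subshell body keeps the `cd` and variables local to the function.
write_checksum() (
  bundle="$1"
  cd "$(dirname "$bundle")" || exit 1
  name="$(basename "$bundle")"
  # Store a relative file name so the check works from any mount point.
  sha256sum "$name" > "$name.sha256"
)

# Hypothetical helper (air-gapped side): return non-zero if the bundle
# no longer matches the recorded checksum.
verify_checksum() (
  bundle="$1"
  cd "$(dirname "$bundle")" || exit 1
  sha256sum --check --quiet "$(basename "$bundle").sha256"
)

# Example usage:
#   write_checksum  ./linkage-bundle.tgz      # before transfer
#   verify_checksum ./linkage-bundle.tgz      # after transfer, before `verify --offline`
```

Both files (`linkage-bundle.tgz` and `linkage-bundle.tgz.sha256`) need to travel together on the transfer media. A checksum only detects accidental corruption; it is not a substitute for the signed report produced in the result export step.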