eBPF: Introduction and Use Cases - Extended Berkeley Packet Filter Guide

Introduction

eBPF (extended Berkeley Packet Filter) represents a revolutionary technology that fundamentally transforms how we interact with the Linux kernel. Originally designed for packet filtering, eBPF has evolved into a comprehensive framework enabling developers to run sandboxed programs within the kernel without modifying kernel source code or loading kernel modules. This capability unlocks observability, security, and networking possibilities that previously demanded risky kernel modifications or were impossible altogether.

Major technology companies including Facebook, Netflix, Cloudflare, and Google leverage eBPF extensively for performance monitoring, security enforcement, load balancing, and network optimization. Cloudflare processes millions of packets per second using XDP (eXpress Data Path) powered by eBPF for DDoS mitigation. Netflix monitors application performance across tens of thousands of servers using eBPF-based tracing. Cilium, a leading cloud-native networking and security solution, uses eBPF as its core technology for Kubernetes networking and security policy enforcement.

eBPF's significance goes beyond incremental improvement: it fundamentally changes how we approach kernel programming, observability, and security. Traditional kernel development required months of testing and validation before production deployment. eBPF enables dynamic instrumentation, letting engineers deploy monitoring, tracing, and security solutions in minutes, without reboots or kernel module compilation.

This comprehensive guide explores eBPF fundamentals, architecture, practical use cases, development workflows, performance optimization, security implications, and troubleshooting methodologies essential for leveraging this transformative technology in production environments.

Theory and Core Concepts

eBPF Architecture

eBPF provides a virtual machine within the Linux kernel:

eBPF Virtual Machine: Executes eBPF bytecode with JIT (Just-In-Time) compilation to native machine code for performance. Provides a safe execution environment with:

  • Bounded Loops: Loops must be provably bounded (supported since kernel 5.3), guaranteeing termination
  • Memory Safety: Verifier ensures no unauthorized memory access
  • Limited Stack: 512-byte stack prevents resource exhaustion
  • Helper Functions: Controlled kernel function access via whitelisted helpers

eBPF Verifier: Analyzes programs before execution ensuring:

  • All code paths terminate (no infinite loops)
  • No unauthorized memory access
  • Register usage correctness
  • Proper bounds checking
  • Type safety

eBPF Maps: Kernel data structures enabling:

  • Communication between kernel and userspace
  • Data sharing between eBPF programs
  • Statistics collection and aggregation
  • Configuration storage

Map Types:

  • Hash Maps: Key-value storage
  • Arrays: Fixed-size indexed storage
  • Program Arrays: Store other eBPF programs
  • Per-CPU Maps: CPU-specific data avoiding synchronization
  • Ring Buffers: Efficient event streaming to userspace
  • LRU Maps: Automatic eviction of least-recently-used entries
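
A practical consequence of per-CPU maps: each CPU writes its own value slot lock-free, so userspace sees one value per CPU for each key and must aggregate them itself (BCC's per-CPU tables, for example, return a list of per-CPU values). A minimal userspace sketch, with made-up counts:

```python
# Conceptual sketch: a PERCPU map keeps one slot per CPU for each key,
# so the true total is the sum of the slots. The numbers below are
# illustrative, not real kernel data.
def percpu_total(per_cpu_values):
    """Aggregate one key's per-CPU slots into a single counter."""
    return sum(per_cpu_values)

# e.g. packet counts recorded independently on 4 CPUs
slots = [120, 98, 0, 342]
print(percpu_total(slots))  # 560
```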

eBPF Program Types

Different program types attach to various kernel hooks:

Socket Filters (BPF_PROG_TYPE_SOCKET_FILTER): Classic packet filtering attached to sockets. Original BPF use case.

Kprobes/Kretprobes: Dynamic instrumentation of kernel functions. Kprobes fire on function entry (with access to arguments); kretprobes fire on return (with access to the return value).

Tracepoints: Static instrumentation points maintained across kernel versions. More stable than kprobes.

XDP (eXpress Data Path): Processes packets at earliest point after driver reception. Enables high-performance packet processing for DDoS mitigation, load balancing, packet filtering.

TC (Traffic Control): Hooks into Linux TC subsystem for packet manipulation, redirection, and QoS.

Cgroup Programs: Attach to cgroup events for resource control, access control, and monitoring.

LSM (Linux Security Module) Hooks: Security enforcement at LSM hooks. Enables custom security policies.

Perf Events: Profile CPU performance events, hardware events, software events.

eBPF vs Traditional Approaches

Understanding eBPF advantages over alternatives:

vs Kernel Modules:

  • eBPF: Safe, verified, no kernel recompilation, hot-deployable
  • Kernel Modules: Unverified; a bug can crash the kernel, and modules must be recompiled for each kernel version

vs Userspace Processing:

  • eBPF: Runs in kernel, zero context switches, high performance
  • Userspace: Context switch overhead, slower, limited kernel visibility

vs SystemTap/DTrace:

  • eBPF: Built into Linux kernel, portable, standard tooling
  • SystemTap: Requires kernel debuginfo, complex setup, stability issues

XDP Architecture

XDP deserves special attention for networking use cases:

XDP Operating Modes:

  1. Offload Mode: Runs on supported NIC hardware, bypassing the host CPU entirely (specific hardware support required)
  2. Native (Driver) Mode: Runs in the NIC driver before sk_buff allocation; the fastest on-host option
  3. Generic Mode: Fallback for any driver; runs after sk_buff allocation, so slowest

XDP Actions:

  • XDP_DROP: Drop packet immediately (DDoS mitigation)
  • XDP_PASS: Continue normal kernel processing
  • XDP_TX: Transmit packet back on receiving interface
  • XDP_REDIRECT: Redirect to different interface or CPU
  • XDP_ABORTED: Error condition, drop packet and trace
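
These actions are simply integer return codes from enum xdp_action in the kernel UAPI headers (linux/bpf.h). A small Python sketch pins down the numeric values an XDP program returns:

```python
from enum import IntEnum

# XDP return codes as defined in enum xdp_action (linux/bpf.h)
class XdpAction(IntEnum):
    XDP_ABORTED = 0   # error condition: drop and raise a trace event
    XDP_DROP = 1      # drop the packet immediately
    XDP_PASS = 2      # hand off to the normal network stack
    XDP_TX = 3        # bounce back out the receiving interface
    XDP_REDIRECT = 4  # send to another interface or CPU

print(XdpAction.XDP_DROP.value)  # 1
```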

Prerequisites

Kernel Requirements

Minimum Kernel Version: 4.4+ (5.x+ recommended for full feature support)

Verify eBPF Support:

# Check kernel version
uname -r

# Verify eBPF support
cat /proc/config.gz | gunzip | grep BPF
# or
grep BPF /boot/config-$(uname -r)

# Required options:
# CONFIG_BPF=y
# CONFIG_BPF_SYSCALL=y
# CONFIG_BPF_JIT=y
# CONFIG_HAVE_EBPF_JIT=y

Enable BPF JIT (if not enabled):

echo 1 > /proc/sys/net/core/bpf_jit_enable

# Make persistent across reboots, then apply immediately
echo "net.core.bpf_jit_enable = 1" >> /etc/sysctl.d/99-bpf.conf
sysctl -p /etc/sysctl.d/99-bpf.conf

Software Prerequisites

Development Tools:

RHEL/Rocky/CentOS:

# Install LLVM/Clang for BPF compilation
dnf install -y clang llvm

# Install development headers
dnf install -y kernel-devel kernel-headers

# Install BPF compiler collection (BCC)
dnf install -y bcc-tools python3-bcc

# Install bpftool
dnf install -y bpftool

# Install libbpf development
dnf install -y libbpf-devel

Ubuntu/Debian:

# Install LLVM/Clang
apt update
apt install -y clang llvm libelf-dev

# Install kernel headers
apt install -y linux-headers-$(uname -r)

# Install BCC
apt install -y bpfcc-tools python3-bpfcc

# Install bpftool
apt install -y linux-tools-common linux-tools-$(uname -r)

# Install libbpf
apt install -y libbpf-dev

Verify Installation:

# Check clang version (10+ recommended)
clang --version

# Verify bpftool
bpftool version

# Test BCC installation
python3 -c "import bcc; print('BCC installed successfully')"

Advanced Configuration

Hello World eBPF Program (BCC)

Simple tracing example:

#!/usr/bin/env python3
# hello_world.py - Trace execve syscall

from bcc import BPF

# BPF program
program = """
#include <uapi/linux/ptrace.h>

int trace_execve(struct pt_regs *ctx) {
    char comm[16];
    bpf_get_current_comm(&comm, sizeof(comm));

    bpf_trace_printk("Process %s called execve\\n", comm);
    return 0;
}
"""

# Load BPF program
b = BPF(text=program)

# Attach to execve syscall
b.attach_kprobe(event=b.get_syscall_fnname("execve"), fn_name="trace_execve")

print("Tracing execve syscall... Press Ctrl-C to exit")

# Read trace pipe
try:
    b.trace_print()
except KeyboardInterrupt:
    print("Exiting")

Run:

chmod +x hello_world.py
sudo ./hello_world.py

XDP Packet Filtering

High-performance packet filter:

// xdp_filter.c - Drop packets from specific IP

#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <linux/in.h>
#include <bpf/bpf_helpers.h>

// 10.0.0.1, converted to network byte order to match ip->saddr
#define BLOCKED_IP __constant_htonl(0x0A000001)

SEC("xdp")
int xdp_filter_prog(struct xdp_md *ctx) {
    void *data_end = (void *)(long)ctx->data_end;
    void *data = (void *)(long)ctx->data;

    // Parse Ethernet header
    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end)
        return XDP_PASS;

    // Check if IPv4
    if (eth->h_proto != __constant_htons(ETH_P_IP))
        return XDP_PASS;

    // Parse IP header
    struct iphdr *ip = (struct iphdr *)(eth + 1);
    if ((void *)(ip + 1) > data_end)
        return XDP_PASS;

    // Drop packets from blocked IP
    if (ip->saddr == BLOCKED_IP) {
        return XDP_DROP;
    }

    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";

Compile and load:

# Compile to BPF bytecode
clang -O2 -target bpf -c xdp_filter.c -o xdp_filter.o

# Load XDP program
sudo ip link set dev eth0 xdp obj xdp_filter.o sec xdp

# Verify loaded
sudo ip link show dev eth0

# Remove XDP program
sudo ip link set dev eth0 xdp off
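
A subtle point in the filter above: ip->saddr sits in the packet in network byte order, so the comparison constant must be byte-swapped on little-endian hosts (hence htonl). This userspace check illustrates the mismatch:

```python
import socket
import struct

# 10.0.0.1 as it appears on the wire (network byte order)
wire_bytes = socket.inet_aton("10.0.0.1")      # b'\x0a\x00\x00\x01'

# What a little-endian CPU sees loading those bytes as a u32
as_le_u32 = struct.unpack("<I", wire_bytes)[0]
print(hex(as_le_u32))   # 0x100000a, not 0xa000001

# The network-byte-order value, i.e. htonl(0x0A000001) on such hosts
as_be_u32 = struct.unpack(">I", wire_bytes)[0]
print(hex(as_be_u32))   # 0xa000001
```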

TCP Connection Tracking

Monitor TCP connections:

#!/usr/bin/env python3
# tcp_tracker.py - Track TCP connections

from bcc import BPF
from socket import inet_ntop, AF_INET
import struct

# BPF program
program = """
#include <uapi/linux/ptrace.h>
#include <net/sock.h>
#include <bcc/proto.h>

// Data structure for events
struct conn_event {
    u32 saddr;
    u32 daddr;
    u16 sport;
    u16 dport;
    u32 state;
};

BPF_PERF_OUTPUT(events);

int trace_tcp_state_change(struct pt_regs *ctx, struct sock *sk, int newstate) {
    if (newstate != TCP_ESTABLISHED && newstate != TCP_CLOSE)
        return 0;

    struct conn_event event = {};

    // Extract connection information
    u16 family = sk->__sk_common.skc_family;
    if (family != AF_INET)
        return 0;

    event.saddr = sk->__sk_common.skc_rcv_saddr;
    event.daddr = sk->__sk_common.skc_daddr;
    event.sport = sk->__sk_common.skc_num;
    event.dport = sk->__sk_common.skc_dport;
    event.state = newstate;

    events.perf_submit(ctx, &event, sizeof(event));
    return 0;
}
"""

# Load BPF program
b = BPF(text=program)

# Attach to TCP state change function
b.attach_kprobe(event="tcp_set_state", fn_name="trace_tcp_state_change")

# Event handler
def print_event(cpu, data, size):
    event = b["events"].event(data)

    saddr = inet_ntop(AF_INET, struct.pack("I", event.saddr))
    daddr = inet_ntop(AF_INET, struct.pack("I", event.daddr))
    sport = event.sport  # skc_num is already in host byte order
    # skc_dport is in network byte order; swap back to host order
    dport = struct.unpack(">H", struct.pack("H", event.dport))[0]
    state = "ESTABLISHED" if event.state == 1 else "CLOSE"  # TCP_ESTABLISHED == 1

    print(f"{saddr}:{sport} -> {daddr}:{dport} [{state}]")

# Open perf buffer
b["events"].open_perf_buffer(print_event)

print("Tracking TCP connections... Press Ctrl-C to exit")

try:
    while True:
        b.perf_buffer_poll()
except KeyboardInterrupt:
    print("Exiting")
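
A note on the port fields used above: skc_num (the source port) is stored in host byte order, but skc_dport is in network byte order and must be swapped before display. The conversion socket.ntohs performs on little-endian hosts can be sketched deterministically with struct:

```python
import struct

def net_to_host_u16(raw):
    """Swap a u16 that was read natively on a little-endian host back to
    its network-byte-order value (what socket.ntohs does on such hosts)."""
    return struct.unpack(">H", struct.pack("<H", raw))[0]

# Port 80 on the wire is bytes 0x00 0x50; a little-endian load of
# skc_dport therefore yields 0x5000 until swapped back
print(net_to_host_u16(0x5000))  # 80
```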

File Access Monitoring

Security monitoring example:

#!/usr/bin/env python3
# file_monitor.py - Monitor file access

from bcc import BPF

program = """
#include <uapi/linux/ptrace.h>
#include <linux/fs.h>

struct file_info {
    u32 pid;
    char comm[16];
    char filename[256];
};

BPF_PERF_OUTPUT(events);

int trace_open(struct pt_regs *ctx, struct file *file) {
    struct file_info info = {};

    info.pid = bpf_get_current_pid_tgid() >> 32;
    bpf_get_current_comm(&info.comm, sizeof(info.comm));

    // d_name.name holds only the final path component (e.g. "passwd"),
    // not the full path, so filter on the parent directory's name
    struct dentry *dentry = file->f_path.dentry;
    struct dentry *parent = dentry->d_parent;

    char dirname[8] = {0};
    bpf_probe_read_kernel_str(&dirname, sizeof(dirname),
                              (void *)parent->d_name.name);
    bpf_probe_read_kernel_str(&info.filename, sizeof(info.filename),
                              (void *)dentry->d_name.name);

    // Only report files whose immediate parent directory is "etc"
    if (dirname[0] == 'e' && dirname[1] == 't' &&
        dirname[2] == 'c' && dirname[3] == 0) {
        events.perf_submit(ctx, &info, sizeof(info));
    }

    return 0;
}
"""

b = BPF(text=program)
b.attach_kprobe(event="security_file_open", fn_name="trace_open")

def print_event(cpu, data, size):
    event = b["events"].event(data)
    print(f"PID {event.pid} ({event.comm.decode()}) opened: {event.filename.decode()}")

b["events"].open_perf_buffer(print_event)

print("Monitoring /etc/ directory access... Press Ctrl-C to exit")

try:
    while True:
        b.perf_buffer_poll()
except KeyboardInterrupt:
    print("Exiting")

Performance Monitoring

CPU profiling with eBPF:

#!/usr/bin/env python3
# cpu_profile.py - CPU profiler using eBPF

from bcc import BPF, PerfType, PerfSWConfig
from time import sleep

program = """
#include <uapi/linux/ptrace.h>
#include <linux/sched.h>

struct key_t {
    u32 pid;
    char name[16];
    int stack_id;
};

BPF_HASH(counts, struct key_t);
BPF_STACK_TRACE(stack_traces, 10000);

int on_cpu_sample(struct bpf_perf_event_data *ctx) {
    struct key_t key = {};

    key.pid = bpf_get_current_pid_tgid() >> 32;
    bpf_get_current_comm(&key.name, sizeof(key.name));
    key.stack_id = stack_traces.get_stackid(&ctx->regs, BPF_F_USER_STACK);

    counts.increment(key);
    return 0;
}
"""

b = BPF(text=program)
b.attach_perf_event(ev_type=PerfType.SOFTWARE,
                    ev_config=PerfSWConfig.CPU_CLOCK,
                    fn_name="on_cpu_sample",
                    sample_period=0,
                    sample_freq=49)

print("Profiling CPU usage for 10 seconds...")
sleep(10)

# Print results
counts = b.get_table("counts")
stack_traces = b.get_table("stack_traces")

print("\nTop CPU consumers:")
for k, v in sorted(counts.items(), key=lambda kv: kv[1].value, reverse=True)[:20]:
    print(f"{k.name.decode():16s} {v.value:10d}")
    if k.stack_id >= 0:
        # resolve user-space addresses against the sampled process
        for addr in stack_traces.walk(k.stack_id):
            print(f"    {b.sym(addr, k.pid).decode()}")

Performance Optimization

Map Optimization

Choose appropriate map types:

// Per-CPU map for high-frequency updates (avoids locks)
struct {
    __uint(type, BPF_MAP_TYPE_PERCPU_HASH);
    __uint(max_entries, 10000);
    __type(key, u32);
    __type(value, u64);
} stats SEC(".maps");

// LRU map for automatic eviction
struct {
    __uint(type, BPF_MAP_TYPE_LRU_HASH);
    __uint(max_entries, 10000);
    __type(key, u32);
    __type(value, struct data);
} lru_cache SEC(".maps");

// Array for fixed-size indexed data
struct {
    __uint(type, BPF_MAP_TYPE_ARRAY);
    __uint(max_entries, 256);
    __type(key, u32);
    __type(value, u64);
} counters SEC(".maps");
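
To build intuition for LRU_HASH eviction, here is a plain-Python sketch of the semantics using OrderedDict. Note this is a conceptual model only: the kernel's implementation is an approximate LRU with per-CPU free lists, not an exact one.

```python
from collections import OrderedDict

# Userspace sketch of LRU_HASH semantics: when the map is full,
# inserting a new key evicts the least-recently-used entry.
class LruMap:
    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.data = OrderedDict()

    def lookup(self, key):
        if key in self.data:
            self.data.move_to_end(key)  # mark as recently used
            return self.data[key]
        return None

    def update(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        elif len(self.data) >= self.max_entries:
            self.data.popitem(last=False)  # evict the LRU entry
        self.data[key] = value

m = LruMap(max_entries=2)
m.update("a", 1)
m.update("b", 2)
m.lookup("a")        # touch "a" so "b" becomes least recently used
m.update("c", 3)     # evicts "b"
print(sorted(m.data))  # ['a', 'c']
```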

JIT Compilation

Ensure JIT is enabled:

# Enable BPF JIT
echo 1 > /proc/sys/net/core/bpf_jit_enable

# Enable BPF JIT hardening (slight performance cost)
echo 2 > /proc/sys/net/core/bpf_jit_harden

# Confirm the current setting
cat /proc/sys/net/core/bpf_jit_enable

Helper Function Selection

Use efficient helper functions:

// Fast: use bpf_ktime_get_ns() for timestamps
u64 timestamp = bpf_ktime_get_ns();

// Slow: avoid string operations in hot paths
// bpf_probe_read_str() can be expensive

// Fast: use per-CPU maps for statistics
struct data *d = bpf_map_lookup_elem(&percpu_map, &key);
if (d) {
    d->counter++;
}

Batch Operations

Update maps efficiently:

import ctypes as ct

# Inefficient: update entries one by one (one syscall each)
for i in range(1000):
    b["map"][ct.c_uint(i)] = ct.c_uint(i * 2)

# Efficient: single batched syscall (kernel 5.6+, recent BCC)
keys = (ct.c_uint * 1000)(*range(1000))
values = (ct.c_uint * 1000)(*[i * 2 for i in range(1000)])
b["map"].items_update_batch(keys, values)

Monitoring and Observability

BPF Program Inspection

# List loaded BPF programs
bpftool prog show

# Show program details
bpftool prog show id 42

# Dump program bytecode
bpftool prog dump xlated id 42

# Dump program JIT code
bpftool prog dump jited id 42

# Pin program to filesystem
bpftool prog pin id 42 /sys/fs/bpf/my_prog

# Inspect a pinned program
bpftool prog show pinned /sys/fs/bpf/my_prog

Map Inspection

# List BPF maps
bpftool map show

# Show map contents
bpftool map dump id 10

# Update map entry
bpftool map update id 10 key 0x0a 0x00 0x00 0x01 value 0x01 0x00 0x00 0x00

# Delete map entry
bpftool map delete id 10 key 0x0a 0x00 0x00 0x01

# Pin map to filesystem
bpftool map pin id 10 /sys/fs/bpf/my_map

Performance Monitoring

# Profile a program's runtime metrics (kernel 5.7+)
bpftool prog profile id 42 duration 30 cycles instructions

# Show BPF statistics
cat /proc/sys/kernel/bpf_stats_enabled
echo 1 > /proc/sys/kernel/bpf_stats_enabled
bpftool prog show

Troubleshooting

Verifier Rejection

Symptom: BPF program rejected by verifier.

Diagnosis:

# View verifier log
dmesg | tail -100

# Load with debug output, which includes the full verifier log
bpftool -d prog load program.o /sys/fs/bpf/prog

# Common issues:
# - Unbounded loops
# - Invalid memory access
# - Stack overflow
# - Uninitialized variables

Resolution:

// Bad: loop bound unknown to the verifier
for (int i = 0; i < n; i++) { }  // rejected when n is unbounded

// Good: bounded loop
#define MAX_ITER 100
for (int i = 0; i < MAX_ITER && i < n; i++) { }

// Bad: potential NULL dereference
struct data *d = bpf_map_lookup_elem(&map, &key);
d->field = value;  // Verifier rejects

// Good: NULL check
struct data *d = bpf_map_lookup_elem(&map, &key);
if (!d)
    return 0;
d->field = value;

Performance Issues

Symptom: High CPU usage or latency from BPF programs.

Diagnosis:

# Profile BPF program (kernel 5.7+)
bpftool prog profile id 42 duration 30 cycles

# Check program complexity
bpftool prog show id 42

# Monitor map operations
bpftool map show id 10

Resolution:

# Optimize map type (use per-CPU maps)
# Reduce program complexity
# Use tail calls for complex logic
# Enable JIT compilation
echo 1 > /proc/sys/net/core/bpf_jit_enable

XDP Not Working

Symptom: XDP program not processing packets.

Diagnosis:

# Check XDP attachment
ip link show dev eth0

# Verify driver support (native mode)
ethtool -i eth0

# Check for errors
dmesg | grep -i xdp

# Test in generic mode
ip link set dev eth0 xdpgeneric obj program.o sec xdp

Resolution:

# Use supported driver or generic mode
# Check for conflicting XDP programs
ip link set dev eth0 xdp off

# Reload program
ip link set dev eth0 xdp obj program.o sec xdp verbose

Conclusion

eBPF represents a paradigm shift in Linux kernel programming, observability, and security. By enabling safe, verified programs to run within the kernel, eBPF unlocks capabilities previously impossible without risky kernel modifications—dynamic tracing, high-performance packet processing, security policy enforcement, and comprehensive system observability.

Organizations leveraging eBPF gain unprecedented visibility into system behavior, network performance, and security events while maintaining kernel stability and security. The verifier ensures program safety, preventing crashes and security vulnerabilities that plague traditional kernel module approaches.

Successful eBPF adoption requires understanding kernel architecture, networking fundamentals, and performance optimization techniques. While tools like BCC simplify eBPF development, advanced use cases demand deeper knowledge of kernel internals, map optimization, and helper function selection.

As cloud-native computing, Kubernetes, and service mesh architectures become ubiquitous, eBPF's importance continues growing. Projects like Cilium, Falco, and Pixie demonstrate eBPF's transformative potential for container networking, security, and observability. Engineers mastering eBPF position themselves at the forefront of Linux innovation, building next-generation infrastructure platforms leveraging this revolutionary technology.