bpftrace: DTrace for Linux, and the One-Liners That Pay for Themselves

2026-06-08

For thirty years, every time I wanted to know what the kernel was actually doing, I had bad options. strace stops the traced process on every syscall with ptrace. perf samples but doesn't tell you why. SystemTap compiles a kernel module and made me cry in 2009. bpftrace finally fixes it: an awk-shaped language that compiles to eBPF, is JIT-compiled to run in the kernel at native speed, and aggregates results in-kernel so you don't drown in events.

The one-liner most engineers learn first — and the one that has saved me a dozen late nights — is the system-wide openat() tap:

bpftrace -e '
  tracepoint:syscalls:sys_enter_openat {
    printf("%-16s %s\n", comm, str(args->filename));
  }'

Every process. Every file open. No attaching, no PID, no recompile. In my experience the overhead is around 1% on a busy box, though it grows with how often the probe fires. Try doing that with strace -f from PID 1 and watch your machine catch fire.

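The real party trick is in-kernel aggregation. Want a histogram of read latencies, bucketed in log2, in microseconds, for every vfs_read() call on the system, live? (vfs_read sits above the block layer, so this covers every read that goes through the VFS, cached or not, rather than per-device I/O.)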

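bpftrace -e '
  // Record the entry timestamp per thread.
  kprobe:vfs_read { @start[tid] = nsecs; }
  // On return, bucket the elapsed time in microseconds.
  kretprobe:vfs_read /@start[tid]/ {
    @us = hist((nsecs - @start[tid]) / 1000);
    delete(@start[tid]);
  }
  // Drop leftover timestamps so only the histogram prints on exit.
  END { clear(@start); }'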

Hit Ctrl-C and you get an ASCII histogram. No log file rotation, no awk post-processing, no missed events because your userspace was paged out.
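
The same pattern extends to keyed maps. As a rough sketch (the function name read_latency_by_comm is made up for illustration, and the shell-function wrapper is only there so the one-liner can live in a profile), keying the histogram on comm should give one latency distribution per process name:

```shell
# Hypothetical helper: per-process vfs_read() latency histograms, in microseconds.
read_latency_by_comm() {
  bpftrace -e '
    kprobe:vfs_read { @start[tid] = nsecs; }
    kretprobe:vfs_read /@start[tid]/ {
      // One log2 histogram per process name.
      @us[comm] = hist((nsecs - @start[tid]) / 1000);
      delete(@start[tid]);
    }
    // Discard in-flight timestamps so only the histograms print on exit.
    END { clear(@start); }'
}
```

Run it as root, let it collect for a while, then hit Ctrl-C to print one histogram per command name.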

A few more I keep in my back pocket:

List every probe point your kernel exposes (there are tens of thousands) with bpftrace -l 'tracepoint:*' or bpftrace -l 'kprobe:tcp_*'. The probe namespace alone is an education: uprobe and uretprobe hook userspace functions, and usdt hooks USDT markers, which PostgreSQL, Python, and OpenJDK can all be built with.
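
Two follow-ups to the listing trick, sketched as shell functions with made-up names (probe_args and bash_lines are not part of bpftrace). The first uses -lv to show the argument fields a probe exposes. The second is a uretprobe on bash's readline function. It assumes /bin/bash exports that symbol, which is not true on every distro.

```shell
# Hypothetical helper: show the fields available through "args" for a probe.
# Defaults to the openat tracepoint when no probe name is given.
probe_args() {
  bpftrace -lv "${1:-tracepoint:syscalls:sys_enter_openat}"
}

# Hypothetical helper: print every line entered into an interactive bash.
# Assumes /bin/bash exports the readline symbol; where bash links against a
# shared libreadline instead, the probe would need to target that library.
bash_lines() {
  bpftrace -e 'uretprobe:/bin/bash:readline { printf("%-8d %s\n", pid, str(retval)); }'
}
```

Something like probe_args 'tracepoint:syscalls:sys_exit_openat' is a quick way to check field names before writing a script against them.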

Why this beats the mainstream tools:

The catches: it needs root (or, on 5.8+ kernels and bpftrace builds that allow it, CAP_BPF+CAP_PERFMON), it needs a kernel with BTF to resolve kernel types and arguments by name without headers (most distros ship it now), and you can hang yourself with a map keyed on something high-cardinality. Use hist(), lhist(), and count(). Don't printf a million events a second unless you want to watch a tree fall in a forest you can't observe.
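
To make the count() advice concrete, here is a minimal sketch of the openat tap rewritten to aggregate instead of print. The function name opens_by_comm and the 5-second interval are arbitrary choices for illustration.

```shell
# Hypothetical helper: openat() calls per process name, reported every 5 seconds.
opens_by_comm() {
  bpftrace -e '
    tracepoint:syscalls:sys_enter_openat { @opens[comm] = count(); }
    // Print and reset the counters on a timer instead of emitting per event.
    interval:s:5 { print(@opens); clear(@opens); }'
}
```

The map is keyed on the process name only, so its size stays bounded by the number of distinct command names rather than the number of events.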

Once it clicks, you stop reaching for strace entirely for production diagnostics. Brendan Gregg's book BPF Performance Tools and the bpftrace reference guide on GitHub are the only docs you need.

Key Takeaway: bpftrace turns the Linux kernel into a queryable surface. Write a one-line script, get an in-kernel aggregated answer at roughly 1% overhead for probes like the ones above, and finally stop guessing what the machine is doing.
