2026-04-25
Every time your userspace code needs the kernel to do something — open a file, allocate memory, send a packet — it must cross the privilege boundary via a system call. Understanding this mechanism is essential because syscalls are the single most expensive "function call" your program makes regularly.
On x86-64 Linux, the modern syscall path uses the syscall instruction (replacing the older int 0x80 trap). The convention is precise:
On ARM64 (AArch64), the equivalent is the svc #0 instruction, with the syscall number in X8 and arguments in X0–X5.
Here's a minimal x86-64 write syscall in inline assembly — no libc involved:
const char msg[] = "hello\n";
long ret;
asm volatile(
"syscall"
: "=a"(ret)
: "a"(1), // syscall number: write
"D"(1), // fd: stdout
"S"(msg), // buffer
"d"(sizeof(msg)-1) // count
: "rcx", "r11", "memory"
);
Real-world impact: A raw syscall on modern x86-64 hardware takes roughly 50–100 nanoseconds of overhead (mode switch, KPTI page table swap, speculative execution mitigations). Compare that to a normal function call at ~1–2 ns. This means a tight loop doing one syscall per iteration is 50–100x slower than one doing purely userspace work. This is exactly why high-performance code uses techniques like io_uring (batch submissions), mmap (avoid read/write syscalls entirely), and the vDSO — a kernel-mapped shared library that implements certain syscalls like gettimeofday and clock_gettime entirely in userspace, eliminating the mode switch.
Rule of thumb: if your hot path makes more than ~10,000 syscalls per second, the kernel boundary overhead itself starts consuming a meaningful percentage of a single core. At 100 ns each, 10,000 syscalls burn 1 ms per second — about 0.1% of a core. At 1,000,000 syscalls/sec, that's 100 ms/sec — 10% of a core lost purely to mode switching.
You can trace your program's syscalls with strace -c ./program to get a summary showing call counts and cumulative time. This is often the fastest way to diagnose unexpected I/O or permission issues, and to find syscalls worth batching or eliminating.
