2026-05-01
You already know a system call requires a privilege transition: userspace loads the arguments into registers, executes syscall (x86-64) or svc (ARM64), the CPU switches to kernel mode (ring 0 on x86), the kernel does its work, then returns. That round trip costs roughly 100-150 nanoseconds on modern x86-64. For calls made millions of times per second, such as gettimeofday() or clock_gettime(), this overhead adds up quickly. The vDSO (virtual Dynamic Shared Object) removes the transition for a small set of such calls.
The vDSO is a small ELF shared library that the kernel maps into every process's address space automatically. You never link against it; the kernel injects it at exec time. The dynamic linker discovers it and routes certain libc calls through it. The key insight: alongside the code, the kernel maps a read-only region of kernel-maintained data (clock base values, the parameters for converting cycle counts to nanoseconds, etc.) into userspace. On x86-64 this data region appears as a separate [vvar] mapping next to [vdso]. The vDSO code reads this shared data directly, so no privilege transition is needed.
You can see it yourself:
- cat /proc/self/maps | grep vdso shows the mapped region (typically one or two 4 KB pages)
- getconf GNU_LIBC_VERSION or ldd --version shows your glibc version; glibc routes the supported calls through the vDSO automatically
- getauxval(AT_SYSINFO_EHDR) returns the address of the vDSO's ELF header, which the kernel passes via the auxiliary vector at exec time (on glibc systems, LD_SHOW_AUXV=1 /bin/true prints the whole vector)
On x86-64 Linux, the vDSO typically accelerates four calls: clock_gettime(), gettimeofday(), time(), and getcpu(); newer kernels also export clock_getres(). On ARM64 the set is clock_gettime(), gettimeofday(), and clock_getres(). These all share one trait: they read kernel state but never modify it, so readers need no lock, only a cheap consistency check against concurrent kernel updates.
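The mapping and the AT_SYSINFO_EHDR entry described above can also be checked from inside a process. The following is a minimal sketch, assuming Linux and a libc that exports getauxval(); the function names find_mappings and vdso_base_from_auxv are made up for illustration, and [vvar] is the name x86-64 kernels give the data mapping (other architectures or kernel versions may label it differently).

```python
import ctypes

AT_SYSINFO_EHDR = 33  # auxiliary-vector key, as defined in <elf.h>


def find_mappings(maps_text, names=("[vdso]", "[vvar]")):
    """Return {name: (start, end)} for named regions in /proc/<pid>/maps text."""
    found = {}
    for line in maps_text.splitlines():
        fields = line.split()
        # The pathname column (6th field) is absent for anonymous mappings.
        if len(fields) >= 6 and fields[5] in names:
            start, end = (int(x, 16) for x in fields[0].split("-"))
            found[fields[5]] = (start, end)
    return found


def vdso_base_from_auxv():
    """Ask libc for the vDSO ELF header address (0 if the entry is absent)."""
    libc = ctypes.CDLL(None)
    libc.getauxval.restype = ctypes.c_ulong
    libc.getauxval.argtypes = [ctypes.c_ulong]
    return libc.getauxval(AT_SYSINFO_EHDR)


if __name__ == "__main__":
    with open("/proc/self/maps") as f:
        regions = find_mappings(f.read())
    for name, (start, end) in regions.items():
        print(f"{name}: {start:#x}-{end:#x} ({end - start} bytes)")
    print(f"AT_SYSINFO_EHDR: {vdso_base_from_auxv():#x}")
```

On a typical system the AT_SYSINFO_EHDR value should equal the start address of the [vdso] line, and both should change from run to run when ASLR is enabled.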
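The kernel updates the vDSO data page whenever it updates its timekeeping state, typically on each timer tick. Between updates, the vDSO code reads the hardware counter (the TSC on x86-64) and adds the scaled delta to the base value in the page, which is how it reaches nanosecond resolution without entering the kernel. To stay consistent with a concurrent update, the vDSO code uses a seqlock pattern: read a sequence counter (an odd value means an update is in progress), read the data, re-read the counter. If the counter was odd or has changed, retry. Readers take no lock and never enter the kernel, though they are not strictly wait-free, because a read that races with an update has to retry.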
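The read-check-retry loop above can be modeled in a few lines. This is a single-threaded toy, not the kernel's code: TimePage, read_time, and the interfere hook are hypothetical names, the hook stands in for a kernel update that lands in the middle of a read, and the memory barriers a real seqlock needs are omitted because Python has no use for them here.

```python
class TimePage:
    """Toy stand-in for the kernel-maintained data page."""

    def __init__(self):
        self.seq = 0   # even: stable, odd: update in progress
        self.sec = 0
        self.nsec = 0

    def update(self, sec, nsec):
        self.seq += 1                    # counter becomes odd: writer active
        self.sec, self.nsec = sec, nsec
        self.seq += 1                    # counter becomes even: data stable


def read_time(page, interfere=None):
    """Seqlock-style read. Returns ((sec, nsec), number_of_retries)."""
    retries = 0
    while True:
        start = page.seq
        if start & 1:
            # A writer is mid-update. In this single-threaded model update()
            # always completes, so this branch is never actually taken.
            retries += 1
            continue
        snapshot = (page.sec, page.nsec)
        if interfere is not None:
            interfere()                  # simulate a concurrent kernel update
        if page.seq == start:
            return snapshot, retries     # nothing changed: snapshot is valid
        retries += 1                     # data changed underneath us: retry


if __name__ == "__main__":
    page = TimePage()
    page.update(100, 500)
    pending = [(101, 0)]

    def writer_once():
        # Fires one update during the first read attempt, then goes quiet.
        if pending:
            page.update(*pending.pop())

    print(read_time(page, writer_once))  # ((101, 0), 1)
```

The first attempt sees the counter move from 2 to 4 and discards its snapshot; the second attempt completes undisturbed and returns the updated value after one retry.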
Real-world impact: A high-frequency trading application calling clock_gettime() in a tight loop sees roughly 15ns via vDSO versus 120ns via true syscall—an 8x speedup. At 10 million timestamp reads per second, that saves a full CPU core's worth of kernel time.
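The arithmetic behind that estimate is easy to reproduce. A small sketch using only the illustrative figures quoted above (15ns, 120ns, 10 million calls per second); core_fraction_saved is a hypothetical helper name.

```python
def core_fraction_saved(calls_per_sec, syscall_ns, vdso_ns):
    """CPU-seconds saved per wall-clock second, i.e. the fraction of one core."""
    saved_ns_per_sec = calls_per_sec * (syscall_ns - vdso_ns)
    return saved_ns_per_sec / 1_000_000_000


print(120 / 15)                                   # 8.0 (the speedup factor)
print(core_fraction_saved(10_000_000, 120, 15))   # 1.05 (about one full core)
```

Each call saves 105ns, and 10 million of those per second add up to 1.05 seconds of CPU time per second of wall-clock time.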
Rule of thumb: if you're benchmarking syscall overhead and gettimeofday() looks suspiciously fast (under 25ns), you're measuring the vDSO, not a real kernel transition. To measure true syscall cost, use a call that always enters the kernel, such as syscall(SYS_getpid). Plain getpid() also works on current glibc, but versions before 2.25 cached the result in userspace.
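One way to see the gap is to time the same clock read through libc (which normally goes through the vDSO) and through the raw syscall() wrapper (which always enters the kernel). The sketch below assumes 64-bit Linux on x86-64 or ARM64; the syscall numbers in the table are the ones for those two architectures only, and compare and ns_per_call are hypothetical names. Interpreter and ctypes overhead adds a large constant to both paths, so only the difference between the two figures is meaningful, and even that is noisy; a C benchmark is needed for numbers like the ones quoted above.

```python
import ctypes
import platform
import time

# Assumption: clock_gettime syscall numbers for 64-bit Linux only.
SYS_CLOCK_GETTIME = {"x86_64": 228, "aarch64": 113}


class Timespec(ctypes.Structure):
    # Layout of struct timespec on 64-bit Linux.
    _fields_ = [("tv_sec", ctypes.c_long), ("tv_nsec", ctypes.c_long)]


def ns_per_call(fn, iterations):
    """Average wall-clock nanoseconds per call of fn()."""
    start = time.perf_counter_ns()
    for _ in range(iterations):
        fn()
    return (time.perf_counter_ns() - start) / iterations


def compare(iterations=200_000):
    libc = ctypes.CDLL(None, use_errno=True)
    libc.clock_gettime.argtypes = [ctypes.c_int, ctypes.POINTER(Timespec)]
    libc.clock_gettime.restype = ctypes.c_int
    libc.syscall.restype = ctypes.c_long

    ts = Timespec()
    ts_ref = ctypes.byref(ts)
    clock_id = time.CLOCK_MONOTONIC
    # syscall() is variadic, so pass explicitly typed arguments.
    nr_arg = ctypes.c_long(SYS_CLOCK_GETTIME[platform.machine()])
    clock_arg = ctypes.c_long(clock_id)

    def via_libc():
        return libc.clock_gettime(clock_id, ts_ref)       # usually the vDSO

    def via_syscall():
        return libc.syscall(nr_arg, clock_arg, ts_ref)    # always the kernel

    # Sanity check: both paths must succeed before we time them.
    if via_libc() != 0 or via_syscall() != 0:
        raise OSError(ctypes.get_errno(), "clock_gettime failed")

    return {
        "libc_ns": ns_per_call(via_libc, iterations),
        "syscall_ns": ns_per_call(via_syscall, iterations),
    }


if __name__ == "__main__":
    result = compare()
    print(f"libc (vDSO) path: {result['libc_ns']:.0f} ns/call")
    print(f"raw syscall path: {result['syscall_ns']:.0f} ns/call")
```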
The older vsyscall page (fixed address 0xffffffffff600000) was the predecessor. It was deprecated because its fixed address gave attackers executable code at a known location to mine for ROP gadgets. The vDSO is placed by ASLR: its address is randomized per process, which removes that fixed target. Modern kernels either emulate vsyscall through a slow trap, for backward compatibility only, or disable it outright, depending on configuration.
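To extract and inspect the vDSO: dd if=/proc/self/mem bs=1 skip=$((0x...)) count=4096 of=vdso.so (using the address from /proc/self/maps), then objdump -d vdso.so. Two caveats apply. Set count to the size of the [vdso] region in bytes, since 4096 only covers a one-page vDSO. And because the address is randomized per process, the address and the memory must come from the same process: /proc/self means something different in each command, so read /proc/<pid>/maps and /proc/<pid>/mem of one process you are permitted to inspect, or do both steps inside a single program. In the disassembly you'll see compact code for each accelerated call, mostly compiled C rather than hand-written assembly.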
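Doing both steps inside one process sidesteps the question of whose address space is being read. A minimal sketch, assuming Linux with a readable /proc/self/mem; vdso_range and read_vdso are hypothetical helper names, and the output file name vdso.so simply follows the command above.

```python
def vdso_range(maps_text):
    """Return (start, end) of the [vdso] mapping from /proc/<pid>/maps text."""
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[5] == "[vdso]":
            start, end = (int(x, 16) for x in fields[0].split("-"))
            return start, end
    raise LookupError("no [vdso] mapping found")


def read_vdso(maps_text, mem):
    """Copy the vDSO image out of a seekable, binary view of process memory."""
    start, end = vdso_range(maps_text)
    mem.seek(start)
    image = mem.read(end - start)
    if len(image) != end - start:
        raise IOError("short read while copying the vDSO")
    if image[:4] != b"\x7fELF":
        raise ValueError("mapping does not start with an ELF header")
    return image


if __name__ == "__main__":
    # Both files describe this very process, so the address always matches.
    with open("/proc/self/maps") as maps, \
            open("/proc/self/mem", "rb", buffering=0) as mem:
        image = read_vdso(maps.read(), mem)
    with open("vdso.so", "wb") as out:
        out.write(image)
    print(f"wrote {len(image)} bytes to vdso.so")
```

The resulting file can then be inspected with objdump -d vdso.so, or with objdump -T vdso.so to list the exported symbols.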
