Tool Nobody Knows: vmtouch: The Page Cache Whisperer

2026-05-18

You benchmark a query. First run: 4.2 seconds. Second run: 80ms. You shrug and call it "caching." But what is cached, how much, and can you control it? The Linux page cache is a massive lever on performance that almost nobody touches directly. vmtouch by Doug Hoyte is the surgical instrument for it.

Mainstream advice tells you to drop caches with echo 3 > /proc/sys/vm/drop_caches (nuclear, requires root, drops the clean page cache plus dentries and inodes for the whole machine) or just "run it twice." vmtouch lets you inspect, load, evict, and lock specific files, page by page, with no root needed for most operations.

Inspect what's resident:

$ vmtouch /var/lib/postgresql/data/base
       Files: 1247
  Directories: 38
  Resident Pages: 89234/412908  348M/1.5G  21.6%
       Elapsed: 0.18 seconds

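About 21.6% of your Postgres data files are in RAM right now. Add -v for a per-file map that draws an ASCII bar of resident versus cold pages, which is genuinely useful for spotting which indexes get touched. Run it as a user that owns or can write the files (here, postgres or root): newer kernels limit what mincore(2) reveals about files the caller cannot write, so an unprivileged run can report every page as resident.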
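For a sense of what the inspection step involves, here is a minimal Python sketch that counts resident pages for one file by calling mincore(2) through ctypes. It is an illustration under assumptions, not vmtouch's actual code: the helper name resident_pages is made up for this example, and it assumes Linux with a libc that exports mmap, munmap, and mincore.

```python
import ctypes
import mmap
import os

# Load the C library already linked into the interpreter (Linux assumption).
_libc = ctypes.CDLL(None, use_errno=True)
_libc.mmap.restype = ctypes.c_void_p
_libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                       ctypes.c_int, ctypes.c_int, ctypes.c_long]
_libc.munmap.restype = ctypes.c_int
_libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
_libc.mincore.restype = ctypes.c_int
_libc.mincore.argtypes = [ctypes.c_void_p, ctypes.c_size_t,
                          ctypes.POINTER(ctypes.c_ubyte)]
_MAP_FAILED = ctypes.c_void_p(-1).value


def _raise_errno():
    err = ctypes.get_errno()
    raise OSError(err, os.strerror(err))


def resident_pages(path):
    """Return (resident, total) page counts for one file (hypothetical helper)."""
    size = os.path.getsize(path)
    if size == 0:
        return 0, 0
    total = (size + mmap.PAGESIZE - 1) // mmap.PAGESIZE
    fd = os.open(path, os.O_RDONLY)
    try:
        # Mapping the file does not read it; pages are only faulted in on access.
        addr = _libc.mmap(None, size, mmap.PROT_READ, mmap.MAP_SHARED, fd, 0)
        if addr == _MAP_FAILED:
            _raise_errno()
        try:
            # mincore fills one byte per page; the low bit means "resident".
            vec = (ctypes.c_ubyte * total)()
            if _libc.mincore(addr, size, vec) != 0:
                _raise_errno()
            resident = sum(b & 1 for b in vec)
        finally:
            _libc.munmap(addr, size)
    finally:
        os.close(fd)
    return resident, total
```

Calling resident_pages("/some/file") returns a pair such as (resident, total); dividing the two gives a percentage comparable to the one vmtouch prints.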

Pre-warm the cache before a benchmark or deploy:

$ vmtouch -t ./big-lookup-table.idx
       Files: 1
  Resident Pages: 524288/524288  2G/2G  100%
       Elapsed: 1.4 seconds

No more "first request is slow" surprises after a restart. Pair it with a systemd unit and your service comes up hot.
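Warming is conceptually just reading. A minimal Python sketch, assuming a plain sequential read is an acceptable stand-in for what vmtouch -t does (the function name warm and the 1 MiB chunk size are arbitrary choices for illustration, not anything vmtouch defines):

```python
def warm(path, chunk_size=1 << 20):
    """Read path start to finish so its pages land in the page cache.

    Returns the number of bytes read; the data itself is discarded.
    """
    total = 0
    # buffering=0 skips Python's own buffer; the kernel still caches the pages.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total
```

Under memory pressure the kernel is free to drop these pages again, which is the gap that locking closes.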

Evict specific files to simulate cold-start without nuking the whole machine:

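$ hyperfine --prepare 'vmtouch -e ./test-data.parquet' './my-query test-data.parquet'

The --prepare command runs before every timing run, so each measurement starts cold. A single vmtouch -e up front would only make the first run cold, because that run pulls the file straight back into the cache. You can now measure honest cold-cache numbers on a shared box without touching anyone else's working set, as long as nobody else is using that file. Eviction is a hint, not a guarantee: dirty pages and pages another process has mapped can stay resident. This is the killer feature: drop_caches is a sledgehammer, vmtouch -e is tweezers.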
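If you would rather evict from inside a test harness, the same hint is available from Python's standard library. This is a sketch under assumptions (Linux, and a hypothetical helper name evict), not a copy of what vmtouch does internally:

```python
import os


def evict(path):
    """Ask the kernel to drop cached pages for path (advisory only)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Dirty pages cannot be dropped, so flush pending writes first.
        os.fsync(fd)
        # offset=0, length=0 means "the whole file".
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

The call succeeding does not prove the pages are gone; check residency afterwards if the benchmark depends on it.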

Lock files in memory so they're never evicted:

# Pin the entire SQLite DB into RAM as a daemon
$ vmtouch -dl /var/db/hot.sqlite
       Files: 1
  Resident Pages: 131072/131072  512M/512M  100%
   Locked Pages: 131072/131072  512M/512M  100%
   Daemonized: PID 8421

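Now the kernel can't evict those pages under memory pressure for as long as the daemon keeps running. Locked memory counts against RLIMIT_MEMLOCK, so pinning 512M usually means raising that limit or running with CAP_IPC_LOCK (or as root). For latency-sensitive read paths where you can spare the RAM, this can stand in for an application-level cache with no code changes: the kernel keeps serving reads at memory speed via the normal pread/mmap path.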
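Locking fails when the file is larger than the process's locked-memory limit, so it can be worth checking before you daemonize anything. A rough Python sketch, assuming the soft RLIMIT_MEMLOCK of the current process is the one that will apply (the helper name fits_memlock_limit is hypothetical, and the check ignores any other memory the locking process has already locked):

```python
import mmap
import os
import resource


def fits_memlock_limit(path):
    """Return (fits, bytes_needed, soft_limit) for locking path in RAM.

    soft_limit is None when the limit is unlimited. Processes with
    CAP_IPC_LOCK (typically root) are not bound by this limit at all.
    """
    size = os.path.getsize(path)
    # Locking works on whole pages, so round the size up to a page boundary.
    needed = -(-size // mmap.PAGESIZE) * mmap.PAGESIZE
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    if soft == resource.RLIM_INFINITY:
        return True, needed, None
    return needed <= soft, needed, soft
```

If the first element comes back False, raise the limit (for example with ulimit -l in the launching shell) or run the locker with the extra capability.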

One more trick: directory trees and globbing work. Want to know how much of your Git object store is in RAM after a build?

$ vmtouch .git/objects
  Resident Pages: 4831/28104  18M/109M  17.2%

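Or load every shared library a binary is linked against (ldd lists link-time dependencies, not libraries pulled in later via dlopen):

$ ldd ./myapp | awk '$2 == "=>" && $3 ~ /^\// {print $3}' | xargs vmtouch -t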

Install: apt install vmtouch, brew install vmtouch, or grab the single ~600-line C file from hoytech.com; it's been stable for over a decade. It uses mincore(2) for inspection, mmap plus a read of each page for loading, mlock(2) for locking, and posix_fadvise with POSIX_FADV_DONTNEED for eviction. No kernel module, no magic, and no root for the common cases.

The day you realize page-cache state is something you can see and control, half of your "why is this slow sometimes?" mysteries dissolve.

Key Takeaway: The Linux page cache is a programmable resource, not a black box — vmtouch lets you inspect, warm, evict, and pin individual files so your benchmarks tell the truth and your hot paths stay hot.