2026-05-06
When you read() a file, the kernel copies bytes from the page cache into your buffer. mmap() skips that copy: it maps file pages directly into your address space, so loads from your pointer are loads from the page cache itself. The first access to each page triggers a major page fault if the page isn't cached, or a minor fault (just a page-table fixup) if it is.
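For comparison, the read() path described above can be sketched as follows. This is a minimal illustration, not a tuned implementation: the name count_newlines_read and the 64 KiB buffer size are assumptions made here, and counting newlines simply stands in for any pass over the bytes.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: count '\n' bytes using buffered read().
 * Every read() call copies bytes from the page cache into buf.
 * Returns the count, or -1 on error. */
long count_newlines_read(const char *path) {
    assert(path != NULL);
    unsigned char buf[64 * 1024];      /* assumed buffer size */

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    long count = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n == 0)
            break;                     /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;              /* interrupted by a signal: retry */
            close(fd);
            return -1;
        }
        for (ssize_t i = 0; i < n; i++)
            if (buf[i] == '\n')
                count++;
    }
    close(fd);
    return count;
}
```

The buffer is reused on every iteration, so the working set stays small no matter how large the file is.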
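How the mapping works. mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0) reserves len bytes of virtual address space and records, in a new VMA, that the range is backed by the file. No I/O happens yet, and no PTEs are installed. When you walk the pointer, the first touch of a page takes a fault and the kernel populates the PTE so that it points at the file's page in the page cache, reading the page from disk first if necessary. With MAP_PRIVATE, stores land in private copy-on-write pages and never reach the file; with MAP_SHARED, dirty pages get written back by the flusher threads or on msync().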
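A minimal sketch of that sequence, using the same mmap() arguments as above. The function name count_newlines_mmap is hypothetical, and error handling is deliberately simple (it does not guard against the file being truncated while mapped).

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: count '\n' bytes through a read-only mapping.
 * Returns the count, or -1 on error. */
long count_newlines_mmap(const char *path) {
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {             /* mmap() rejects a zero length */
        close(fd);
        return 0;
    }
    size_t len = (size_t)st.st_size;

    /* Reserves address space only; no file I/O happens here. */
    const unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                         /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    long count = 0;
    for (size_t i = 0; i < len; i++)   /* pages are faulted in on first touch */
        if (p[i] == '\n')
            count++;

    munmap((void *)p, len);
    return count;
}
```

There is no user-space buffer here: the loop reads straight out of the mapped pages.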
When mmap wins:
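- Random access: no lseek+read dance, just pointer arithmetic.
When read() wins:
- Sequential one-pass scans: read() with a small buffer fits in L2 and avoids TLB pressure.
- Files that can be truncated while mapped: touching a page past the new end of file raises SIGBUS, not EOF. Robust code installs a handler or uses read().
Real example: ripgrep. By default it chooses between the two heuristically: when given a small number of explicit file paths it mmaps them and runs SIMD pattern matching directly on the mapping, and when walking a directory tree it falls back to buffered read(), because for many small files the mmap setup and page-fault overhead exceeds the cost of a few buffered reads. The crossover point is empirical, so measure on your hardware.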
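The SIGBUS hazard can be handled with a handler that jumps back out of the faulting loop. The sketch below is one common pattern, not the only one: the names on_sigbus and checked_sum are hypothetical, and the single static jump buffer means it is only safe in single-threaded code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf sigbus_jmp;               /* single-threaded use only */
static volatile sig_atomic_t sigbus_armed;

static void on_sigbus(int sig) {
    (void)sig;
    if (sigbus_armed)
        siglongjmp(sigbus_jmp, 1);          /* unwind out of the read loop */
    signal(SIGBUS, SIG_DFL);                /* not ours: fall back to default */
    raise(SIGBUS);
}

/* Hypothetical helper: sum len bytes at p, which points into a file mapping.
 * Returns 0 on success, or -1 if the walk hit SIGBUS (for example because
 * the file was truncated). *out holds the sum accumulated so far. */
int checked_sum(const unsigned char *p, size_t len, unsigned long *out) {
    assert(out != NULL);
    struct sigaction sa, old;
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGBUS, &sa, &old) != 0)
        return -1;

    volatile unsigned long sum = 0;         /* volatile: survives the jump */
    int rc = 0;
    if (sigsetjmp(sigbus_jmp, 1) == 0) {    /* 1: save and restore signal mask */
        sigbus_armed = 1;
        for (size_t i = 0; i < len; i++)
            sum += p[i];
    } else {
        rc = -1;                            /* arrived here via siglongjmp */
    }
    sigbus_armed = 0;
    sigaction(SIGBUS, &old, NULL);          /* restore the previous handler */
    *out = sum;
    return rc;
}
```

A caller treats a -1 return as "the file changed underneath me" and can retry with read(), which reports a short read or EOF instead of a signal.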
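Rule of thumb. Page faults cost ~1–3 µs each (minor) or 10–100 µs (major, served from an SSD; a spinning disk takes milliseconds). On 4 KiB pages, a 1 GiB file = 262,144 pages. If every page takes one minor fault: 262,144 × 1–3 µs ≈ 0.26–0.79 seconds of fault overhead alone. Use madvise(MADV_SEQUENTIAL) to request more aggressive readahead, and MADV_WILLNEED to start reading pages you know you'll touch into the page cache, so that the later faults on them are minor rather than major.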
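The arithmetic and the two madvise() hints can be sketched together. All three function names and the hot_len parameter are hypothetical, and the per-fault cost is whatever you measure on your own machine. The hints are advisory, so the sketch ignores their return values.

```c
#define _DEFAULT_SOURCE                /* exposes madvise() on glibc */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Number of pages needed to cover file_bytes, rounding up. */
size_t pages_for(size_t file_bytes, size_t page_bytes) {
    assert(page_bytes > 0);
    return (file_bytes + page_bytes - 1) / page_bytes;
}

/* Fault overhead in seconds if every page takes exactly one fault. */
double fault_overhead_s(size_t file_bytes, size_t page_bytes,
                        double us_per_fault) {
    return (double)pages_for(file_bytes, page_bytes) * us_per_fault / 1e6;
}

/* Map path read-only, hint sequential access over the whole range, and ask
 * the kernel to start reading the first hot_len bytes.
 * Returns the mapping and stores its length, or NULL on error or empty file. */
void *map_with_hints(const char *path, size_t hot_len, size_t *len_out) {
    assert(path != NULL && len_out != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }
    size_t len = (size_t)st.st_size;
    void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return NULL;

    (void)madvise(p, len, MADV_SEQUENTIAL);      /* advisory only */
    if (hot_len > len)
        hot_len = len;
    if (hot_len > 0)
        (void)madvise(p, hot_len, MADV_WILLNEED); /* start readahead now */

    *len_out = len;
    return p;
}
```

For example, fault_overhead_s((size_t)1 << 30, 4096, 1.0) reproduces the lower bound of the estimate above. The caller releases the mapping with munmap(p, len).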
Gotcha: MAP_POPULATE (Linux-specific) prefaults the whole mapping at mmap() time, eliminating per-page faults later but blocking the call until the file has been read in. Great for latency-sensitive paths after startup; terrible if you only touch 1% of the file.
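A minimal sketch of a prefaulted mapping, assuming Linux. The function name map_prefaulted is hypothetical. Note that population is best effort: mmap() does not report an error if some pages could not be populated.

```c
#define _DEFAULT_SOURCE                /* exposes MAP_POPULATE on glibc */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map path read-only and prefault it up front.
 * The mmap() call blocks while the file is read in.
 * Returns the mapping and stores its length, or NULL on error or empty file. */
void *map_prefaulted(const char *path, size_t *len_out) {
    assert(path != NULL && len_out != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }
    size_t len = (size_t)st.st_size;

    /* MAP_POPULATE asks the kernel to fill the page tables now. */
    void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE | MAP_POPULATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return NULL;

    *len_out = len;
    return p;
}
```

Calling this once during startup moves the fault cost out of the hot path; the caller releases the mapping with munmap(p, len).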
