2026-05-27
When your CPU issues a load to address 0xFEE00020 (the Local APIC) versus 0x7FFE1234 (your heap), it treats them completely differently. The APIC access bypasses every cache and serializes with prior stores; the heap access fills a cache line, may be prefetched, and can be reordered. The CPU doesn't guess — it looks up the memory type for that physical address through two cooperating mechanisms: MTRRs (Memory Type Range Registers) and PAT (Page Attribute Table).
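The five memory types that matter: UC (uncacheable), WC (write-combining), WT (write-through), WP (write-protected), and WB (write-back).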
MTRRs are a small set of MSRs (typically 8–10 variable-range pairs) set up by firmware that paint memory-type stripes onto physical address ranges at boot. They're coarse: each variable range has a power-of-two size and a base aligned to that size. The BIOS uses them to mark, say, 0xFEE00000–0xFEE01000 as UC for the APIC, and RAM as WB.
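To make the "power-of-two, base-aligned" constraint concrete, here is a minimal Python sketch of how a variable-range base/mask pair selects addresses. The function names are made up for illustration, and the 36-bit physical address width is an assumption (real code would read the width from CPUID).

```python
PAGE = 4096  # variable-range MTRRs work at 4 KiB granularity

def fits_one_mtrr(base: int, size: int) -> bool:
    """True if [base, base + size) can be described by a single variable-range MTRR."""
    power_of_two = size >= PAGE and (size & (size - 1)) == 0
    return power_of_two and base % size == 0

def mtrr_mask(size: int, phys_bits: int = 36) -> int:
    """Address bits of the mask register for a power-of-two size.

    phys_bits is an assumed physical address width, not a fixed constant.
    """
    if size < PAGE or size & (size - 1):
        raise ValueError("size must be a power of two and at least 4 KiB")
    return ((1 << phys_bits) - 1) & ~(size - 1)

def mtrr_matches(addr: int, base: int, mask: int) -> bool:
    """An address falls in the range when its masked bits equal the masked base."""
    return (addr & mask) == (base & mask)

apic_base, apic_size = 0xFEE00000, 0x1000
mask = mtrr_mask(apic_size)
print(fits_one_mtrr(apic_base, apic_size))         # one 4 KiB range, aligned
print(fits_one_mtrr(0x0, 3 << 30))                 # 3 GiB is not a power of two
print(mtrr_matches(0xFEE00020, apic_base, mask))   # the APIC register from the intro
```

A 3 GiB block of RAM would therefore need at least two ranges (2 GiB + 1 GiB), which is one reason firmware runs out of variable-range pairs.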
PAT is the page-granular override. It's an MSR holding 8 memory-type slots, indexed by a 3-bit field built from the PWT, PCD, and PAT bits in your page table entry (PAT is the high bit, PWT the low bit). When the CPU translates a virtual address, it pulls those three bits, indexes into the PAT MSR, gets a memory type, and combines it with the MTRR type using a fixed precedence table (a PAT type of UC always wins; a PAT type of WC overrides whatever the MTRR says; a PAT type of WB defers to the MTRR type; etc.).
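The lookup described above can be sketched in a few lines of Python. The bit positions are those of a 4 KiB page table entry and the MSR value is the architectural power-on default; the function names are illustrative. `effective_type` is deliberately partial: it covers only the three PAT types whose combining rule is the same for every MTRR type, and refuses to guess the rest of the table.

```python
# Memory-type encodings used in each IA32_PAT slot.
PAT_ENCODING = {0x00: "UC", 0x01: "WC", 0x04: "WT", 0x05: "WP", 0x06: "WB", 0x07: "UC-"}

# Power-on default of IA32_PAT: slots 0-3 are WB, WT, UC-, UC, repeated for 4-7.
PAT_DEFAULT = 0x0007040600070406

# Bit positions in a 4 KiB page table entry.
PTE_PWT, PTE_PCD, PTE_PAT = 1 << 3, 1 << 4, 1 << 7

def pat_index(pte: int) -> int:
    """Build the 3-bit PAT index: PAT is bit 2, PCD bit 1, PWT bit 0."""
    pwt = int(bool(pte & PTE_PWT))
    pcd = int(bool(pte & PTE_PCD))
    pat = int(bool(pte & PTE_PAT))
    return (pat << 2) | (pcd << 1) | pwt

def pat_type(pte: int, pat_msr: int = PAT_DEFAULT) -> str:
    """Memory type selected by a PTE: one byte per slot, low 3 bits hold the type."""
    slot = (pat_msr >> (8 * pat_index(pte))) & 0x07
    return PAT_ENCODING[slot]

def effective_type(pat: str, mtrr: str) -> str:
    """Partial combining rule; other PAT types need the full table from the manual."""
    if pat in ("UC", "WC"):
        return pat      # these PAT types override any MTRR type
    if pat == "WB":
        return mtrr     # PAT WB defers to the MTRR type
    raise NotImplementedError(f"combining rule for PAT type {pat} not modelled here")

print(pat_type(0))                    # no cache bits set: slot 0
print(pat_type(PTE_PCD | PTE_PWT))    # both legacy bits set: slot 3
print(effective_type("WC", "WB"))
```

An OS that wants write-combining has to reprogram a slot first, because the power-on default contains no WC entry at all.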
Real-world example: when you mmap a GPU's BAR write-combined, through the PCI sysfs resourceN_wc file or via a DRM ioctl, the kernel chooses a PAT slot configured as WC and sets the PTE bits accordingly. That's why memcpy into VRAM hits ~12 GB/s instead of ~200 MB/s: the CPU's write-combining buffers coalesce stores into 64-byte bursts onto PCIe instead of issuing one uncached store at a time. Get the type wrong and your driver works but is 60× slower.
Rule of thumb: if you see a performance cliff on a region of memory that "should be fast," check /sys/kernel/debug/x86/pat_memtype_list (needs debugfs mounted and root) and /proc/mtrr. A region silently demoted from WB to UC because of an MTRR conflict is the classic cause.
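As a rough aid for that check, the sketch below parses /proc/mtrr-style text and lists every MTRR type covering a given physical address. The sample lines are made up for illustration and follow the layout printed by recent Linux kernels; the regex is an assumption about that layout, so adjust it if your kernel's output differs.

```python
import re

# Matches the register number, base, and size fields of a /proc/mtrr line.
LINE = re.compile(r"reg(\d+): base=0x([0-9a-fA-F]+) \(\s*\d+MB\), size=\s*(\d+)([KMG])B")
TYPES = ("uncachable", "write-combining", "write-through", "write-protect", "write-back")
UNIT = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}

def parse_mtrr(text: str):
    """Return a list of (base, size, type) tuples for each recognised line."""
    ranges = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        mtype = next((t for t in TYPES if t in line), None)
        if m is None or mtype is None:
            continue  # skip lines this sketch does not understand
        base = int(m.group(2), 16)
        size = int(m.group(3)) * UNIT[m.group(4)]
        ranges.append((base, size, mtype))
    return ranges

def types_covering(ranges, addr: int):
    """All MTRR types whose range contains addr; more than one means an overlap."""
    return [t for base, size, t in ranges if base <= addr < base + size]

# Hypothetical sample: a 4 GiB write-back range with a 1 GiB uncachable hole on top.
SAMPLE = """\
reg00: base=0x000000000 (    0MB), size= 4096MB, count=1: write-back
reg01: base=0x0c0000000 ( 3072MB), size= 1024MB, count=1: uncachable
"""

ranges = parse_mtrr(SAMPLE)
print(types_covering(ranges, 0x10000000))
print(types_covering(ranges, 0xD0000000))
```

When an address is covered by overlapping ranges and one of them is uncachable, the hardware treats it as UC, which is exactly the silent demotion to look for. On a real machine, pass the contents of /proc/mtrr instead of SAMPLE.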
