Daily Low-Level Programming: PCID: Why Modern CPUs Don't Flush the TLB on Every Context Switch

2026-06-02

Before 2010, switching address spaces on x86 was brutal: writing to CR3 flushed every non-global TLB entry. Every context switch meant the new process started cold, paying a TLB-miss tax on memory accesses until the translations for its working set had been re-walked from the page tables. PCID (Process-Context Identifier) fixed this by tagging each TLB entry with a 12-bit ID, so entries from different address spaces can coexist.

The mechanism: when PCID is enabled (CR4.PCIDE=1), the low 12 bits of CR3 hold the PCID; the page-table base is 4 KiB aligned, so those bits are never part of the address. A MOV to CR3 now means "switch to address space X with ID Y", and entries tagged with other IDs survive untouched. Bit 63 of the value written to CR3 controls whether the entries already tagged with the new PCID get flushed: leave it clear, and they are invalidated, so the address space starts cold as it did before PCID; set it, and they persist.
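To make the bit layout concrete, here is a minimal sketch of how a CR3 write operand is put together when CR4.PCIDE=1. The helper names `make_cr3` and `split_cr3` are hypothetical, chosen for illustration; only the bit positions (PCID in bits 0-11, the no-flush request in bit 63) come from the architecture.

```python
CR3_PCID_MASK = 0xFFF      # bits 0-11 carry the PCID when CR4.PCIDE=1
CR3_NOFLUSH = 1 << 63      # bit 63 of the operand: keep this PCID's entries


def make_cr3(table_phys, pcid, noflush):
    """Build the 64-bit value a kernel would write to CR3 (illustrative helper)."""
    if table_phys & CR3_PCID_MASK:
        raise ValueError("page-table base must be 4 KiB aligned")
    if not 0 <= pcid <= CR3_PCID_MASK:
        raise ValueError("PCID must fit in 12 bits")
    value = table_phys | pcid
    if noflush:
        value |= CR3_NOFLUSH
    return value


def split_cr3(value):
    """Decode a CR3 write operand into (table_phys, pcid, noflush)."""
    noflush = bool(value & CR3_NOFLUSH)
    pcid = value & CR3_PCID_MASK
    table_phys = value & ~CR3_NOFLUSH & ~CR3_PCID_MASK
    return table_phys, pcid, noflush
```

For example, `make_cr3(0x12345000, 5, True)` yields `0x8000000012345005`: the table base, PCID 5 in the low bits, and bit 63 set to request that existing entries for PCID 5 be kept.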

The Meltdown wrinkle. When Linux deployed KPTI (Kernel Page-Table Isolation) in 2018, every syscall suddenly required two address-space switches: user→kernel→user. Without PCID, this would have meant two full TLB flushes per syscall. Linux instead uses two PCIDs per process: one for the user-mode page tables, one for the kernel-mode page tables. The kernel sets bit 63 of the value it writes to CR3 to skip the flush, and the TLB entries for both views survive across the syscall boundary. On a Skylake with PCID disabled, KPTI could cost 30% or more on syscall-heavy workloads; with PCID, the overhead dropped to roughly 5%.
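The effect on syscall round trips can be sketched with a toy model. This is not how hardware or Linux is implemented: `ToyTLB`, the PCID values 1 and 2, and the page counts are all assumptions picked for illustration, and the model ignores global pages, capacity limits, and eviction.

```python
class ToyTLB:
    """Toy model of a TLB whose entries are tagged with a PCID."""

    def __init__(self):
        self.entries = set()   # (pcid, virtual_page) pairs
        self.pcid = 0
        self.misses = 0

    def switch(self, pcid, noflush, tagged=True):
        """Model a CR3 write; untagged hardware drops every entry."""
        if not tagged:
            self.entries.clear()
            self.pcid = 0
            return
        if not noflush:
            # Only entries tagged with the target PCID are invalidated.
            self.entries = {e for e in self.entries if e[0] != pcid}
        self.pcid = pcid

    def touch(self, page):
        """Access a page, counting a miss if no matching entry is cached."""
        key = (self.pcid, page)
        if key not in self.entries:
            self.misses += 1
            self.entries.add(key)


def syscall_round_trips(tagged, trips=100):
    """Count TLB misses over repeated user -> kernel -> user transitions."""
    user_pcid, kernel_pcid = 1, 2          # hypothetical IDs for one process
    user_pages = range(0, 8)               # assumed user working set
    kernel_pages = range(100, 104)         # assumed kernel working set
    tlb = ToyTLB()
    tlb.switch(user_pcid, noflush=False, tagged=tagged)
    for _ in range(trips):
        for page in user_pages:
            tlb.touch(page)
        tlb.switch(kernel_pcid, noflush=True, tagged=tagged)
        for page in kernel_pages:
            tlb.touch(page)
        tlb.switch(user_pcid, noflush=True, tagged=tagged)
    return tlb.misses
```

In this model the tagged TLB misses only while warming up (12 misses in total), while the untagged one re-walks all 12 pages on every round trip (1,200 misses over 100 trips). Real ratios depend on working-set size and TLB capacity.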

Rule of thumb. A TLB miss costs ~20-100 cycles (a 4-level page walk, possibly with cache misses on each level). A modern process has a working set of hundreds to thousands of TLB entries. Taking about 1,000 entries as a round figure, a context switch back to a previously-running process without PCID pays ~20,000-100,000 cycles repopulating the TLB. With PCID, the cost is near zero, as long as the entries weren't evicted in the meantime.
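The estimate is just entries times cycles per miss. A small worked version follows; the 1,000-entry working set and the 3 GHz clock are assumptions used only to turn cycles into wall-clock time.

```python
def tlb_refill_cycles(entries, cycles_per_miss):
    """Cycles spent re-walking page tables for a cold working set."""
    return entries * cycles_per_miss


def cycles_to_microseconds(cycles, clock_ghz=3.0):
    """Convert a cycle count to microseconds at an assumed clock speed."""
    return cycles / (clock_ghz * 1000)


low = tlb_refill_cycles(1000, 20)     # cheap walks, page-table lines cached
high = tlb_refill_cycles(1000, 100)   # walks that miss the data caches
print(low, high)
print(round(cycles_to_microseconds(low), 1), round(cycles_to_microseconds(high), 1))
```

At the assumed 3 GHz, that range works out to roughly 7 to 33 microseconds per cold switch, which adds up quickly at thousands of switches per second.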

The limit. Only 4096 PCIDs exist, and Linux uses far fewer: it keeps a small per-CPU table of about 6 PCID slots and recycles them aggressively. When a process is assigned a fresh PCID, it pays the cold-TLB cost anyway. One way to observe this on Intel hardware is perf stat -e dtlb_load_misses.miss_causes_a_walk, which spikes when a process resumes after a long sleep.
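A minimal sketch of such a per-CPU slot table is below. The class name, the round-robin replacement policy, and the use of a plain identifier for each address space are assumptions made for illustration; the real kernel logic also tracks TLB generations, which this sketch omits.

```python
NR_SLOTS = 6  # assumed slot count, matching the figure quoted above


class PerCpuPcidSlots:
    """Toy per-CPU table mapping a few PCID slots to address spaces."""

    def __init__(self):
        self.owner = [None] * NR_SLOTS   # address-space ID held by each slot
        self.next_slot = 0               # next slot to recycle (round-robin)

    def choose(self, mm_id):
        """Return (slot, need_flush) for the address space being switched in."""
        if mm_id in self.owner:
            # Warm case: the slot is still ours, so its TLB entries may survive.
            return self.owner.index(mm_id), False
        slot = self.next_slot
        self.next_slot = (self.next_slot + 1) % NR_SLOTS
        self.owner[slot] = mm_id
        # Recycled slot: entries left by the previous owner must be flushed.
        return slot, True
```

With six slots, a process that sleeps while seven or more other address spaces run on the same CPU loses its slot, and its next switch-in comes back with `need_flush` set.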

Real-world impact. Redis, PostgreSQL, and anything doing rapid syscalls saw measurable slowdowns post-Meltdown on CPUs without PCID support (on Intel, parts older than Westmere; Haswell later added the INVPCID instruction, not PCID itself), and some shops replaced hardware to recover throughput. Check /proc/cpuinfo for the pcid flag; if it is missing, KPTI is expensive and you should consider booting with nopti if your threat model permits.
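The flag check is easy to script. The sketch below parses the text of /proc/cpuinfo; the function names are hypothetical, and it assumes the Linux x86 layout in which a line starting with `flags` lists space-separated feature names.

```python
def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_pcid(cpuinfo_text):
    """True if the exact 'pcid' flag is present (not merely 'invpcid')."""
    return "pcid" in cpu_flags(cpuinfo_text)
```

On a Linux machine, `has_pcid(open("/proc/cpuinfo").read())` gives the answer for the first listed CPU. Matching whole flag names matters here, because a substring search for `pcid` would also match `invpcid`.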

Key Takeaway: PCID tags TLB entries with an address-space ID so context switches and KPTI's syscall transitions preserve TLB state instead of paying a cold-cache penalty every time.
