2026-06-09
On x86-64, the FS and GS segment registers don't do segmentation anymore — they exist solely to hold a 64-bit base address that gets added to addresses formed with an fs: or gs: prefix. The kernel uses GS to find per-CPU data; glibc uses FS to find your thread-local storage. The base is hidden state, not addressable like a general-purpose register.
Historically, the only way to change FS_BASE or GS_BASE was a syscall: arch_prctl(ARCH_SET_FS, addr). That's a ring transition, register save, and ~200 cycles — fine for thread creation, awful for anything that wants to swap a "context pointer" on a hot path. Userspace M:N threading libraries (Go runtime predecessors, fiber libraries, certain garbage collectors) all wanted cheap base swapping.
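For reference, the syscall path described above might look like the following. This is a minimal sketch assuming Linux on x86-64 with glibc; the helper names fs_base_via_syscall and set_fs_base_via_syscall are made up for illustration, and glibc exposes no arch_prctl wrapper here, so the raw syscall() interface is used.

```c
#define _GNU_SOURCE
#include <asm/prctl.h>   /* ARCH_GET_FS, ARCH_SET_FS */
#include <stdint.h>
#include <sys/syscall.h> /* SYS_arch_prctl */
#include <unistd.h>      /* syscall() */

/* Hypothetical helper: ask the kernel for the calling thread's FS base.
 * Returns 0 if the syscall fails. */
static uint64_t fs_base_via_syscall(void) {
    unsigned long base = 0;
    if (syscall(SYS_arch_prctl, ARCH_GET_FS, &base) != 0)
        return 0;
    return (uint64_t)base;
}

/* Hypothetical helper: set the FS base through the kernel.
 * Every call is a full kernel entry and exit.
 * Returns 0 on success, -1 on failure (errno is set by syscall()). */
static int set_fs_base_via_syscall(uint64_t base) {
    return (int)syscall(SYS_arch_prctl, ARCH_SET_FS, (unsigned long)base);
}
```

Calling set_fs_base_via_syscall with anything other than a valid TLS block breaks every later TLS access in that thread (errno included), so an experiment should write back the value that fs_base_via_syscall returned.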
Ivy Bridge (2012) added four instructions enabled by setting CR4.FSGSBASE: RDFSBASE, RDGSBASE, WRFSBASE, WRGSBASE. They read or write the hidden base directly from ring 3 — no syscall, no ring transition. Linux only enabled the feature for userspace in kernel 5.9 (2020), because the kernel had to audit every code path that assumed FS_BASE could only change via the syscall (signal handlers, ptrace, core dumps all cache the value).
Concrete example — green-threading library: A fiber scheduler keeps each fiber's TLS block at a fixed offset. On context switch:
syscall arch_prctl → ~200 cycles + cache pollution from kernel entry.
wrfsbase rax → ~15 cycles, single instruction.
That's roughly a 13× speedup on a path that runs millions of times per second in a busy fiber runtime. Go's runtime considered this but ultimately finds the current goroutine through the m→g pointer in per-thread state rather than swapping bases.
Rule of thumb: If you swap "thread context" more than ~10,000 times per second per core, FSGSBASE pays for itself. Below that, the syscall path's simplicity wins.
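To put numbers on the rule of thumb, here is a small sketch of the arithmetic. It takes the approximate figures quoted in this post (~200 cycles per syscall, ~15 per WRFSBASE) as given rather than measured, and the function name is made up for illustration.

```c
#include <stdint.h>

/* Approximate per-swap costs quoted in the text; not measurements. */
enum { SYSCALL_CYCLES = 200, WRFSBASE_CYCLES = 15 };

/* Cycles saved per second on one core when every base swap uses
 * WRFSBASE instead of the arch_prctl syscall. */
static uint64_t cycles_saved_per_second(uint64_t swaps_per_second) {
    return swaps_per_second * (uint64_t)(SYSCALL_CYCLES - WRFSBASE_CYCLES);
}
```

At the 10,000 swaps-per-second threshold this comes to 1,850,000 cycles saved per second per core; at a million swaps per second it is 185,000,000.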
Check support: grep fsgsbase /proc/cpuinfo for hardware; on Linux ≥5.9 with nofsgsbase absent from the kernel cmdline, userspace can use it. Older kernels: WRFSBASE from ring 3 raises #UD (invalid opcode) because CR4.FSGSBASE stays clear.
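A program can make the same check at runtime instead of grepping. The sketch below assumes Linux on x86-64 with GCC or Clang. It relies on the HWCAP2_FSGSBASE bit (bit 1 of AT_HWCAP2) that the kernel documentation describes for this purpose; the wrapper names are made up for illustration, and inline assembly is used so that no special compiler flag is needed.

```c
#include <stdint.h>
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP2 */

#ifndef HWCAP2_FSGSBASE
#define HWCAP2_FSGSBASE (1UL << 1)  /* set by the kernel when ring 3 may use FSGSBASE */
#endif

/* Returns 1 if the kernel has enabled the FSGSBASE instructions
 * for userspace, 0 otherwise. */
static int fsgsbase_usable(void) {
    return (getauxval(AT_HWCAP2) & HWCAP2_FSGSBASE) != 0;
}

/* Read the hidden FS base directly. Only call this when
 * fsgsbase_usable() returned 1; otherwise the instruction raises #UD. */
static uint64_t read_fs_base(void) {
    uint64_t base;
    __asm__ volatile("rdfsbase %0" : "=r"(base));
    return base;
}

/* Write the hidden FS base directly. Same precondition as above.
 * The "memory" clobber stops the compiler from moving TLS accesses
 * across the swap. */
static void write_fs_base(uint64_t base) {
    __asm__ volatile("wrfsbase %0" : : "r"(base) : "memory");
}
```

A caller would test fsgsbase_usable() once at startup and fall back to arch_prctl when it returns 0.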
The subtle trap: signal handlers. A handler runs on whichever thread the signal interrupts, and it expects FS to point at glibc's TLS block. If a signal arrives while your own value is in FS_BASE and nothing puts glibc's base back before the handler runs, the first TLS access in the handler (errno is the classic one) hits the wrong memory. Boom. Interactions like this are a large part of why Linux only enabled the feature in 5.9, eight years after the hardware shipped.
