The IOMMU: Virtual Memory for Devices

2026-05-23

You already know the MMU translates virtual addresses to physical ones for your CPU. The IOMMU (Intel VT-d, AMD-Vi, ARM SMMU) does the same thing for devices. Without it, a PCIe device performing DMA addresses physical memory directly: whatever address the driver programmed into its descriptor ring is the physical address the device hits.

That's dangerous for three reasons:

The IOMMU sits between the device and memory, walking page tables keyed by the device's BDF (Bus:Device.Function) identifier. Each device — or each IOMMU group of devices that share a translation context — gets its own page table. When the NIC issues a DMA write to "address 0x1000", the IOMMU translates that I/O virtual address (IOVA) to a real physical page, exactly like the MMU does for CPU loads.
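To make the per-device lookup concrete, here is a minimal toy model in Python. It is a sketch, not how any real IOMMU is programmed: the ToyIommu class, the flat dictionary standing in for a page table, and the BDFs and addresses are all made up for illustration.

```python
from dataclasses import dataclass, field

PAGE_SHIFT = 12                 # assume 4 KiB pages
PAGE_SIZE = 1 << PAGE_SHIFT
PAGE_MASK = PAGE_SIZE - 1


class IommuFault(Exception):
    """Raised when a device touches an IOVA that has no mapping."""


@dataclass
class ToyIommu:
    # One table per device, keyed by a BDF string such as "0000:03:00.0".
    # Each table maps an IOVA page number to a physical page number.
    tables: dict = field(default_factory=dict)

    def map_page(self, bdf, iova, phys):
        if (iova | phys) & PAGE_MASK:
            raise ValueError("iova and phys must be page aligned")
        self.tables.setdefault(bdf, {})[iova >> PAGE_SHIFT] = phys >> PAGE_SHIFT

    def translate(self, bdf, iova):
        table = self.tables.get(bdf, {})
        try:
            ppn = table[iova >> PAGE_SHIFT]
        except KeyError:
            raise IommuFault(f"{bdf}: no mapping for IOVA {iova:#x}") from None
        # Keep the offset within the page, swap the page number.
        return (ppn << PAGE_SHIFT) | (iova & PAGE_MASK)


iommu = ToyIommu()
# Two devices use the same IOVA but land on different physical pages.
iommu.map_page("0000:03:00.0", 0x1000, 0x7F3A2000)
iommu.map_page("0000:04:00.0", 0x1000, 0x12345000)
print(hex(iommu.translate("0000:03:00.0", 0x1010)))  # prints 0x7f3a2010
```

The point of the model is the key: the same IOVA resolves differently depending on which device issued the request, and an unmapped IOVA faults instead of reaching memory.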

Real-world example: VFIO passthrough. When you assign a GPU to a QEMU guest with vfio-pci, the kernel programs the IOMMU so the GPU's view of memory is the guest's physical address space. The guest driver writes guest-physical addresses into the GPU's command ring; the IOMMU translates those into host-physical pages backing the guest's RAM. The GPU literally cannot touch anything outside that mapping — a hardware-enforced sandbox.
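The sandbox property can be sketched as a range lookup. This toy is only loosely modelled on the shape of a VFIO DMA mapping (an IOVA, a size, and a backing address); GuestDmaWindow, its methods, and the addresses are hypothetical, and real VFIO takes a host virtual address that the kernel pins and resolves rather than a host-physical base.

```python
import bisect


class DmaFault(Exception):
    """Raised when an access falls outside every mapped range."""


class GuestDmaWindow:
    def __init__(self):
        self._starts = []   # sorted start addresses of mapped ranges
        self._ranges = []   # parallel list of (iova, size, host_base)

    def map_range(self, iova, size, host_base):
        if size <= 0:
            raise ValueError("size must be positive")
        i = bisect.bisect_left(self._starts, iova)
        # Reject overlap with the range before and the range after.
        if i > 0:
            prev_iova, prev_size, _ = self._ranges[i - 1]
            if prev_iova + prev_size > iova:
                raise ValueError("overlaps an existing mapping")
        if i < len(self._starts) and iova + size > self._starts[i]:
            raise ValueError("overlaps an existing mapping")
        self._starts.insert(i, iova)
        self._ranges.insert(i, (iova, size, host_base))

    def translate(self, gpa, length=1):
        # Find the last range starting at or below gpa.
        i = bisect.bisect_right(self._starts, gpa) - 1
        if i >= 0:
            iova, size, host_base = self._ranges[i]
            if gpa + length <= iova + size:
                return host_base + (gpa - iova)
        raise DmaFault(f"access {gpa:#x}+{length:#x} is outside the guest mapping")


# Assumed layout: 2 GiB of guest RAM at guest-physical 0,
# backed by host memory starting at 0x2_4000_0000.
window = GuestDmaWindow()
window.map_range(0x0, 2 << 30, 0x2_4000_0000)
print(hex(window.translate(0x1000)))  # prints 0x240001000
```

An access at guest-physical 0x8000_0000, one byte past the end of the mapped range, raises DmaFault in this model, which is the toy equivalent of the IOMMU blocking the transaction.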

The cost: Every DMA now requires an IOTLB lookup, and a miss triggers a multi-level page-table walk (typically 4 levels), just as a CPU TLB miss does. A 100Gbps NIC pushing 14M packets/sec, each touching a different page, can thrash the IOTLB. Mitigations:
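To get a feel for how buffer layout interacts with the IOTLB, here is a small simulation. It assumes a 64-entry, fully associative, LRU-replaced IOTLB, which is a made-up configuration (real entry counts and replacement policies vary by hardware and are often undocumented), and the function name iotlb_miss_rate is hypothetical.

```python
from collections import OrderedDict


def iotlb_miss_rate(addresses, page_shift, entries):
    """Fraction of accesses that miss a fully associative LRU cache
    of page translations with the given number of entries."""
    tlb = OrderedDict()
    misses = 0
    total = 0
    for addr in addresses:
        total += 1
        page = addr >> page_shift
        if page in tlb:
            tlb.move_to_end(page)        # mark as most recently used
        else:
            misses += 1
            tlb[page] = True
            if len(tlb) > entries:
                tlb.popitem(last=False)  # evict least recently used
    return misses / total if total else 0.0


# A ring of 4096 receive buffers, one per 4 KiB page, cycled four times.
addresses = [i * 4096 for i in range(4096)] * 4

small_pages = iotlb_miss_rate(addresses, page_shift=12, entries=64)  # 4 KiB
large_pages = iotlb_miss_rate(addresses, page_shift=21, entries=64)  # 2 MiB
print(f"4 KiB mappings: {small_pages:.4f}")
print(f"2 MiB mappings: {large_pages:.4f}")
```

With 4 KiB mappings the ring spans 4096 distinct pages, far more than 64 entries, so every access in this model misses. The same 16 MiB ring covered by 2 MiB mappings spans only 8 pages, so the model sees 8 cold misses in 16384 accesses.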

Check your groups with find /sys/kernel/iommu_groups/ -type l. Devices in the same group must be passed through together: the IOMMU cannot isolate them from one another, so they share a translation context and may be able to DMA to each other.
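If you want the same information grouped per IOMMU group rather than as a flat list of symlinks, a short script can walk the sysfs tree. This is a sketch that assumes the usual Linux layout of /sys/kernel/iommu_groups/<group>/devices/<BDF>; the function name iommu_groups is my own.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_number: [device names]} read from the sysfs tree.

    Returns an empty dict if the directory is missing, for example
    when the IOMMU is disabled or the system is not Linux.
    """
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups
    for group_dir in base.iterdir():
        devices = group_dir / "devices"
        if not (group_dir.name.isdigit() and devices.is_dir()):
            continue
        # Each entry under devices/ is named after the device's BDF.
        groups[int(group_dir.name)] = sorted(p.name for p in devices.iterdir())
    return groups


if __name__ == "__main__":
    for group_id, devices in sorted(iommu_groups().items()):
        print(f"group {group_id}: {' '.join(devices)}")
```

Any group that prints more than one device is a group you have to hand to the guest as a unit.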

See it in action: Check out "DMA Controller: How Peripheral Devices Transfer Data to RAM" by BitLemon.
Key Takeaway: The IOMMU gives every device its own virtual address space, turning DMA from a "trust the device" gamble into hardware-enforced isolation — at the cost of an IOTLB that you must design your buffer strategy around.
