2026-06-10
You've seen Cache Allocation Technology (CAT) partition L3 cache between workloads. But cache isn't the only shared resource: memory bandwidth is finite too, and one greedy core streaming gigabytes per second can starve every other core on the socket. Memory Bandwidth Allocation (MBA), part of Intel's Resource Director Technology (RDT), lets you cap how much memory bandwidth a core or a group of tasks can consume.
The mechanism is surprising: MBA doesn't measure bytes/second directly. Instead, it injects delay cycles between memory requests issued by a throttled core. The MSR IA32_L2_QoS_Ext_BW_Thrtl_n (one per class of service n) holds a delay value from 0 to 90, in steps of 10 on most parts, where 0 means "no throttling" and 90 means "delay aggressively." The CPU doesn't know your bandwidth target; it just slows your request rate roughly in proportion to that value.
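Because the delay value only comes in coarse steps, any tooling that accepts an arbitrary number has to snap it to something the hardware can represent. Here is a minimal Python sketch of that snapping. The function name snap_throttle and its defaults (step 10, maximum 90) are assumptions taken from the figures above, not an Intel or kernel API, and rounding down (toward less throttling) is just one possible choice.

```python
def snap_throttle(requested: int, step: int = 10, max_throttle: int = 90) -> int:
    """Snap a requested MBA delay value to a supported step.

    Hypothetical helper: clamps to max_throttle, then rounds DOWN to the
    nearest multiple of step, so we never throttle harder than requested.
    """
    if requested < 0:
        raise ValueError("throttle value must be non-negative")
    clamped = min(requested, max_throttle)
    return (clamped // step) * step


if __name__ == "__main__":
    for value in (0, 47, 60, 95):
        print(value, "->", snap_throttle(value))
```

On a real system the actual granularity should be read from the platform rather than hard-coded; the defaults here only mirror the "most parts" case described in the text.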
This indirection causes the classic MBA gotcha: throttling is relative to the workload, not absolute. A core throttled to "50%" doesn't get 50% of socket bandwidth — it gets ~50% of the bandwidth it would have achieved unthrottled. A core that only issues memory requests occasionally feels no throttling at all even at value 90, while a streaming workload gets crushed.
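A toy model makes the relative nature of the throttle easier to see. The sketch below assumes, as a simplification, that the delay value scales down the peak request rate a single core could sustain, and that a workload is only affected when its own demand exceeds that scaled-down ceiling. The function name, the 20 GB/s per-core peak, and the demand figures are all hypothetical illustration values, not measurements.

```python
def achieved_bandwidth(demand_gbps: float, core_peak_gbps: float, throttle: int) -> float:
    """Toy model of MBA: the throttle lowers the core's bandwidth ceiling.

    demand_gbps    -- what the workload would use if unthrottled
    core_peak_gbps -- the most one core could pull with no throttling (assumed)
    throttle       -- MBA delay value, 0..90
    """
    if not 0 <= throttle <= 90:
        raise ValueError("throttle must be between 0 and 90")
    ceiling = core_peak_gbps * (100 - throttle) / 100
    # A workload below the ceiling never notices the throttle.
    return min(demand_gbps, ceiling)


if __name__ == "__main__":
    # Streaming core: demand equals the core's peak, so throttle 50 halves it.
    print(achieved_bandwidth(20.0, 20.0, 50))  # 10.0
    # Occasional-miss core: demand is far below the ceiling even at 90.
    print(achieved_bandwidth(1.0, 20.0, 90))   # 1.0
```

Real hardware is not this clean (the response to the delay value is only approximately linear), but the model captures why the same setting crushes one workload and leaves another untouched.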
Real-world example: Cloud providers running mixed workloads use MBA to protect latency-sensitive services. Picture a database (p99 latency matters) co-located with a batch analytics job on the same socket. The analytics job streams data at 200 GB/s, saturating the memory controller. The database's working set fits in cache, but cache-line fills for its rare misses now wait behind 50 streaming requests, and p99 latency jumps from 200µs to 5ms. Putting the analytics job in an MBA class with throttle value 60 cuts its bandwidth to ~40% of its unthrottled rate, and database p99 drops back under 500µs. The analytics job runs ~30% slower; the database stays fast.
Rule of thumb for sizing: MBA throttle values are coarse (10% steps). Start with value 30 (light throttling) for noisy neighbors and increase by 20 until your latency-sensitive workload meets SLO. Measure with Memory Bandwidth Monitoring (MBM) MSRs — never trust the throttle value as a bandwidth percentage. Effective bandwidth = unthrottled_bw × (1 − throttle/100) only when the workload is memory-bound; otherwise the cap is loose.
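The sizing rule of thumb is a simple search loop, sketched below in Python. The callback meets_slo is hypothetical: in practice it would apply the throttle value to the noisy group, let the system settle, and check the latency-sensitive workload's measured p99 against its SLO. Nothing here talks to hardware.

```python
from typing import Callable, Optional


def find_throttle(
    meets_slo: Callable[[int], bool],
    start: int = 30,
    step: int = 20,
    max_throttle: int = 90,
) -> Optional[int]:
    """Return the lightest throttle value that satisfies the SLO, or None.

    Tries start, start + step, ... up to max_throttle (30, 50, 70, 90 with
    the defaults), stopping at the first value for which meets_slo is true.
    """
    for throttle in range(start, max_throttle + 1, step):
        if meets_slo(throttle):
            return throttle
    return None  # even the heaviest throttle tried was not enough


if __name__ == "__main__":
    # Stand-in for a real measurement: pretend the SLO is met from 60 upward.
    print(find_throttle(lambda t: t >= 60))  # 70
```

Stopping at the first passing value keeps the noisy neighbor as fast as the SLO allows. If the loop returns None, bandwidth throttling alone is not enough and the workloads probably need to be separated some other way.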
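Configure via resctrl: echo "MB:0=60;1=60" > /sys/fs/resctrl/noisy/schemata caps the "noisy" group to 60% on both sockets (domains 0 and 1). Note that the resctrl MB value is the bandwidth percentage allowed, the inverse of the MSR delay value: 100 means unthrottled, so the throttle value 60 from the example above corresponds to MB=40. Combine with CAT (L3:0=0xff;1=0xff) for two-axis isolation: cache and bandwidth, the two resources that actually matter at the socket level.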
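A small Python sketch of generating that schemata line follows. It converts an MSR-style throttle value into the percentage resctrl expects (100 minus the delay value, assuming the default percentage mode with linear delay) and formats one entry per domain. The helper names are hypothetical; apply_schemata simply writes the line to the group's schemata file and needs root plus a mounted resctrl filesystem.

```python
from typing import Dict


def throttle_to_mb_percent(throttle: int) -> int:
    """Convert an MBA delay value (0..90) to a resctrl MB percentage."""
    if not 0 <= throttle <= 90:
        raise ValueError("throttle must be between 0 and 90")
    return 100 - throttle


def mb_schemata_line(percent_by_domain: Dict[int, int]) -> str:
    """Format an MB schemata line, e.g. {0: 60, 1: 60} -> 'MB:0=60;1=60'."""
    parts = [f"{dom}={pct}" for dom, pct in sorted(percent_by_domain.items())]
    return "MB:" + ";".join(parts)


def apply_schemata(group: str, line: str, root: str = "/sys/fs/resctrl") -> None:
    """Write a schemata line for an existing resctrl group (requires root)."""
    with open(f"{root}/{group}/schemata", "w") as f:
        f.write(line + "\n")


if __name__ == "__main__":
    pct = throttle_to_mb_percent(60)          # throttle 60 -> 40% allowed
    print(mb_schemata_line({0: pct, 1: pct}))  # MB:0=40;1=40
```

Keeping the conversion in one named function makes it harder to mix up the two scales, which is the easiest mistake to make when moving between MSR-level descriptions and resctrl.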
