2026-06-01
Cache coherence sounds elegant until you count the wires. With 64 cores all snooping every memory transaction, you'd burn most of your interconnect bandwidth just on coherence chatter — most of which is pointless, because the line isn't cached anywhere relevant. The snoop filter is the hardware that makes coherence scale: a directory-like structure that tracks which cores might hold a given cache line, so the system only bothers the cores that actually matter.
Think of it as a coarse-grained directory sitting at the L3 or the home agent. When a core issues a read-for-ownership, the snoop filter is consulted first. If no other core holds the line, the request goes straight to memory or L3 — no broadcast, no waiting on dozens of cores to respond "nope." If the filter says cores 3, 17, and 42 might have it, only those three get snooped.
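The lookup described above can be sketched as a toy model. This is a minimal illustration, not how any real home agent is built: the `SnoopFilter` class and its method names are hypothetical, the filter has unlimited capacity, and coherence states are reduced to "may hold the line" versus "does not".

```python
from collections import defaultdict


class SnoopFilter:
    """Toy snoop filter: maps a cache-line address to the cores that may hold it."""

    def __init__(self):
        self.sharers = defaultdict(set)  # line address -> set of core ids

    def read_shared(self, core, line):
        """Record `core` as one more possible holder of `line`."""
        self.sharers[line].add(core)

    def read_for_ownership(self, core, line):
        """Return the cores that must be snooped, then make `core` the sole holder."""
        targets = sorted(self.sharers[line] - {core})
        # The snoops invalidate every other copy, so only the requester remains.
        self.sharers[line] = {core}
        return targets


sf = SnoopFilter()
for c in (3, 17, 42):
    sf.read_shared(c, 0x1000)

print(sf.read_for_ownership(5, 0x1000))  # [3, 17, 42]: only these cores are snooped
print(sf.read_for_ownership(5, 0x2000))  # []: untracked line, no snoops at all
```

The second call is the common case that makes the filter worthwhile: an empty result means the request can proceed without disturbing any core.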
Implementation flavors:
Concrete example, Intel Skylake-SP / Ice Lake-SP: Intel moved from an inclusive L3 (where the L3 tags implicitly served as the snoop filter) to a non-inclusive L3 plus an explicit snoop filter. Why? L3 got smaller per core (2.5MB down to 1.375MB on Skylake-SP) and L2 got bigger (256KB up to 1MB), so an inclusive L3 would have spent most of its capacity duplicating L2 contents. The snoop filter is sized to cover the aggregate L2 footprint. The catch: when a line gets evicted from the snoop filter, every core caching it must also evict it (a "back-invalidation"), which is a real and measurable performance pothole.
Rule of thumb: snoop filter coverage ≈ sum of private cache capacities × (1.0 to 1.5). On a 28-core Xeon with 1MB L2 per core, that's ~28-42MB of tracked cache capacity. At 64-byte line granularity, that works out to roughly 460K-690K entries. Even at a lean ~10 bits of sharer and state information per entry, before counting the address tag, that's a significant on-die structure.
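The arithmetic behind the rule of thumb is easy to check. The sketch below assumes 64-byte cache lines and takes the ~10 bits per entry figure at face value (address tags would add to it); the function names are illustrative only.

```python
LINE_BYTES = 64      # assumed cache-line size
MIB = 1024 * 1024


def snoop_filter_entries(cores, l2_bytes_per_core, coverage):
    """Entries needed to track `coverage` times the aggregate L2 capacity."""
    tracked_bytes = cores * l2_bytes_per_core * coverage
    return int(tracked_bytes // LINE_BYTES)


def storage_mib(entries, bits_per_entry):
    """On-die storage for the given entry count, in MiB."""
    return entries * bits_per_entry / 8 / MIB


for coverage in (1.0, 1.5):
    entries = snoop_filter_entries(28, 1 * MIB, coverage)
    print(coverage, entries, round(storage_mib(entries, 10), 2))
# 1.0 458752 0.55
# 1.5 688128 0.82
```

So the 28-42MB figure is the amount of cache the filter covers, which comes to roughly 460K to 690K tracked lines; the filter's own storage is far smaller than the caches it tracks.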
The performance failure mode: a workload that touches more cache lines across cores than the snoop filter can track triggers back-invalidations, and the affected cores then miss on lines they had every reason to expect in L2. Code that runs fine on 16 cores can fall off a cliff at 32 because the aggregate working set blew past the filter. On Intel it shows up in the uncore performance counters as SF_EVICTION events, a signal that's invisible to most profilers but devastating to scaling.
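The cliff can be reproduced with a toy model. This is a sketch under loud assumptions: LRU replacement in the filter (real replacement policies are not specified here), each core looping over its own private set of lines, and no modelling of L2 capacity itself. The names `BoundedSnoopFilter` and `run` are hypothetical.

```python
from collections import OrderedDict


class BoundedSnoopFilter:
    """Toy capacity-limited snoop filter with LRU replacement (an assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # line -> set of core ids, oldest first
        self.back_invalidations = 0

    def access(self, core, line):
        if line in self.entries:
            self.entries.move_to_end(line)  # refresh LRU position
        else:
            if len(self.entries) >= self.capacity:
                # Evicting a tracked line forces every holder to drop its copy.
                _victim, holders = self.entries.popitem(last=False)
                self.back_invalidations += len(holders)
            self.entries[line] = set()
        self.entries[line].add(core)


def run(cores, lines_per_core, capacity, passes=4):
    """Each core repeatedly walks its own private lines; return back-invalidations."""
    sf = BoundedSnoopFilter(capacity)
    for _ in range(passes):
        for core in range(cores):
            for i in range(lines_per_core):
                sf.access(core, core * lines_per_core + i)
    return sf.back_invalidations


for cores in (8, 16, 32):
    print(cores, run(cores, lines_per_core=1000, capacity=24000))
# 8 0
# 16 0
# 32 104000
```

Nothing changes per core between 16 and 32 cores; the only difference is that the aggregate footprint (32,000 lines) now exceeds the filter (24,000 entries), at which point nearly every access evicts a line some core still wanted.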
