2026-05-08
You wrote Verilog. The synthesizer turned it into a netlist of gates. Now comes the part most software engineers never think about: those gates need physical coordinates on a silicon die. Place-and-route (P&R) is the geometry problem that decides whether your chip hits timing or never closes at all.
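The three-stage flow: floorplanning (where the macros and big blocks go), placement (where each standard cell goes), and routing (which metal wires connect them).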
Why this is hard: wires have RC delay. A 2mm wire on M3 can add 200ps of delay, more than the gates it connects. If the placer puts a path's launching and capturing flip-flops on opposite corners of the die, no amount of buffer insertion saves you. Modern tools do timing-driven placement: critical paths get pulled tight, non-critical paths get pushed aside to relieve congestion.
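To see why long wires hurt so much, here is a minimal sketch of the Elmore delay of an unbuffered distributed RC wire (0.5 × R × C), which grows with the square of the length. The per-millimeter resistance and capacitance and the 20ps repeater delay are illustrative assumptions chosen so that a 2mm wire lands at the 200ps quoted above; they are not figures for any real process, and the model ignores driver resistance and load capacitance.

```python
def wire_delay_ps(length_mm, r_ohm_per_mm=500.0, c_ff_per_mm=200.0):
    """Elmore delay of an unbuffered distributed RC wire, in picoseconds.

    r_ohm_per_mm and c_ff_per_mm are assumed, illustrative values.
    Delay = 0.5 * R_total * C_total, so it scales with length squared.
    """
    r_total = r_ohm_per_mm * length_mm             # ohms
    c_total = c_ff_per_mm * length_mm * 1e-15      # farads
    return 0.5 * r_total * c_total * 1e12          # seconds -> picoseconds


def buffered_delay_ps(length_mm, segments, buffer_ps=20.0):
    """Delay when the wire is cut into equal segments, each driven by a repeater.

    buffer_ps is an assumed per-repeater delay.
    """
    per_segment = wire_delay_ps(length_mm / segments)
    return segments * (per_segment + buffer_ps)


if __name__ == "__main__":
    for length in (1.0, 2.0, 4.0):
        print(f"{length:.0f} mm: unbuffered {wire_delay_ps(length):.0f} ps, "
              f"4 repeaters {buffered_delay_ps(length, 4):.0f} ps")
```

Repeaters turn the quadratic growth into roughly linear growth, but the delay still scales with distance, which is why placement has to shorten the critical paths in the first place.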
Real example: Apple's M-series chips place CPU cores adjacent to their L2 cache, because at 3+ GHz every millimeter of wire eats into the L2 latency budget. In one nanosecond light travels roughly 30cm in vacuum, but a signal covers only ~5–10mm of on-chip interconnect. At 3.2 GHz (312.5ps period), a cache that's 4mm away costs you more than a cycle in wire flight time alone, before any gate switches.
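The arithmetic behind that cycle budget can be checked in a few lines. This is a rough sketch that takes the 5–10mm per nanosecond signal speed from the paragraph above at face value; the function names are made up for illustration.

```python
def clock_period_ps(freq_ghz):
    """Clock period in picoseconds for a frequency given in GHz."""
    return 1000.0 / freq_ghz


def wire_flight_ps(distance_mm, speed_mm_per_ns):
    """Time for a signal to cover distance_mm at the given on-chip speed."""
    return distance_mm * 1000.0 / speed_mm_per_ns


if __name__ == "__main__":
    period = clock_period_ps(3.2)
    best = wire_flight_ps(4.0, 10.0)   # optimistic: 10 mm per ns
    worst = wire_flight_ps(4.0, 5.0)   # pessimistic: 5 mm per ns
    print(f"period: {period:.1f} ps")
    print(f"4 mm flight time: {best:.0f} to {worst:.0f} ps "
          f"({best / period:.2f} to {worst / period:.2f} cycles)")
```

Even at the optimistic end of the range, the 4mm trip takes longer than one clock period.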
Rule of thumb — the 70% utilization rule: aim for 60–75% cell-area utilization in each placement region. Below 60% you're wasting silicon. Above 75% the router runs out of space to weave wires between cells, and you'll see routing congestion hotspots — DRC violations the tool can't fix without ripping up placement. A 10mm² block with 65% utilization gives the router breathing room for clock trees and power straps.
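A small sketch of the utilization check, using the 60% and 75% thresholds from the rule of thumb above. The helper names are hypothetical; a real flow would read cell and region areas from the P&R tool's reports.

```python
def utilization(cell_area_mm2, region_area_mm2):
    """Fraction of the region's area occupied by standard cells."""
    return cell_area_mm2 / region_area_mm2


def classify(util, low=0.60, high=0.75):
    """Label a utilization value against the 60-75% rule of thumb."""
    if util < low:
        return "underutilized"
    if util > high:
        return "congestion risk"
    return "ok"


def region_area_for(cell_area_mm2, target_util=0.70):
    """Region area needed to hold the cells at a target utilization."""
    return cell_area_mm2 / target_util


if __name__ == "__main__":
    # The 10 mm^2 block at 65% from the text: 6.5 mm^2 of cells,
    # leaving 3.5 mm^2 for routing, clock trees, and power straps.
    u = utilization(6.5, 10.0)
    print(f"utilization {u:.0%}: {classify(u)}")
    print(f"area for 6.5 mm^2 of cells at 70%: {region_area_for(6.5):.2f} mm^2")
```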
The congestion vs timing tension: spreading cells out reduces congestion but increases wirelength (worse timing). Packing them tight improves timing locally but creates routing hotspots. Floorplanners obsess over this tradeoff via congestion maps — heat maps showing predicted wire density per region. Red zones get manually fixed: add channels, move macros, or partition the block.
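A congestion map can be approximated very crudely by counting how many net bounding boxes cover each grid bin and dividing by an assumed routing capacity per bin. The sketch below is a toy estimator under those assumptions, not what any commercial tool does; real flows estimate demand with a global router. The grid size, capacity, and nets are invented for illustration.

```python
def congestion_map(nets, cols, rows, capacity):
    """Return a rows x cols grid of demand/capacity ratios.

    nets: list of (x0, y0, x1, y1) bounding boxes in bin coordinates,
    inclusive on both ends. Each net adds one unit of demand to every
    bin its bounding box covers (a deliberate overestimate).
    """
    demand = [[0] * cols for _ in range(rows)]
    for x0, y0, x1, y1 in nets:
        for y in range(min(y0, y1), max(y0, y1) + 1):
            for x in range(min(x0, x1), max(x0, x1) + 1):
                demand[y][x] += 1
    return [[d / capacity for d in row] for row in demand]


def hotspots(cmap, threshold=1.0):
    """Bins (x, y) whose demand exceeds the threshold ratio: the red zones."""
    return [(x, y)
            for y, row in enumerate(cmap)
            for x, ratio in enumerate(row)
            if ratio > threshold]


if __name__ == "__main__":
    nets = [(0, 0, 1, 1), (1, 1, 2, 2), (1, 1, 3, 1), (1, 0, 1, 3)]
    cmap = congestion_map(nets, cols=4, rows=4, capacity=2)
    for row in cmap:
        print(" ".join(f"{r:.1f}" for r in row))
    print("hotspots:", hotspots(cmap))
```

In this toy layout all four nets cross bin (1, 1), so it is the only bin over capacity. Spreading the cells that feed those nets would lower the ratio there at the cost of longer wires, which is the tradeoff described above.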
