2026-05-29
Chips fail in a pattern engineers call the bathtub curve: high failure rate at the start (infant mortality), a long flat middle (random failures), and a steep climb at the end (wearout). The flat middle is where you want your customer to live. The infant mortality cliff at the start is where weak chips — ones with marginal oxide, contamination, or near-defect metal — die in the first few hundred hours. Burn-in is the manufacturing step that finds and kills those weak chips before they ship.
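One common way to sketch the bathtub curve is as the sum of three hazard rates: a Weibull term with shape below 1 (falling, infant mortality), a constant term (random failures), and a Weibull term with shape above 1 (rising, wearout). The Python below is a minimal illustration of that idea. Every shape, scale, and rate value in it is a made-up assumption chosen only to produce the characteristic shape, not data for any real chip.

```python
def weibull_hazard(t_hours, shape, scale_hours):
    """Weibull hazard rate h(t) = (shape/scale) * (t/scale)**(shape - 1), per hour."""
    return (shape / scale_hours) * (t_hours / scale_hours) ** (shape - 1)


def bathtub_hazard(t_hours):
    """Sum of three hazard terms; all parameters are illustrative assumptions."""
    # Shape < 1: failure rate falls with time (infant mortality).
    infant = weibull_hazard(t_hours, shape=0.5, scale_hours=1e9)
    # Constant rate: random failures across the flat middle.
    random_failures = 1e-7
    # Shape > 1: failure rate climbs with time (wearout).
    wearout = weibull_hazard(t_hours, shape=4.0, scale_hours=150_000)
    return infant + random_failures + wearout


if __name__ == "__main__":
    for t in (10, 100, 1_000, 10_000, 100_000, 200_000):
        print(f"{t:>7} h  hazard = {bathtub_hazard(t):.2e} per hour")
```

With these assumed parameters the hazard starts high, bottoms out around the ten-thousand-hour mark, and climbs again late in life. Burn-in aims to move the shipped parts past the falling left-hand side before the customer ever sees them.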
The trick is acceleration. Failure mechanisms like oxide breakdown and electromigration follow Arrhenius behavior: their rate rises exponentially with temperature, and over-voltage speeds up oxide breakdown further. So you stuff thousands of chips into a giant oven at 125°C, crank the supply voltage 10–20% above nominal (say 1.32V instead of 1.1V), exercise them with pattern generators for 24–168 hours, then test again. Anything that died is a weak part you didn't want in the field anyway.
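The math uses the Arrhenius equation with an activation energy Ea (typically 0.7 eV for oxide-related failures). The acceleration factor AF between a use temperature and a stress temperature, both in kelvin, is:

$$AF = \exp\left[\frac{E_a}{k}\left(\frac{1}{T_{\mathrm{use}}} - \frac{1}{T_{\mathrm{stress}}}\right)\right]$$

where k is Boltzmann's constant, about 8.617 × 10⁻⁵ eV/K.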
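Rule of thumb: for Ea = 0.7 eV, every 10°C of added stress roughly doubles the aging rate near use temperature (the multiplier shrinks somewhat as the temperature rises). Going from 55°C use to 125°C stress at 0.7 eV gives an acceleration factor of about 78×, so 48 hours of burn-in is thermally equivalent to roughly 3,700 hours, or about five months, in the field. Add voltage stress (each 10% over-voltage gives another 3–10× via the E-model) and those 48 hours can simulate a year or more of customer use.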
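The arithmetic can be checked with a few lines of Python. This is a rough sketch, not a qualified reliability model: the function names are invented for illustration, and the voltage term simply compounds the "3–10× per 10% over-voltage" figure quoted above, which is an assumed simplification of the E-model rather than a fitted one.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann's constant in eV/K


def arrhenius_af(t_use_c, t_stress_c, ea_ev=0.7):
    """Thermal acceleration factor between use and stress temperatures (in °C)."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = (ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k)
    return math.exp(exponent)


def voltage_af(overvoltage_fraction, factor_per_10_percent=3.0):
    """Assumed voltage acceleration: compounds a fixed factor per 10% over-voltage."""
    return factor_per_10_percent ** (overvoltage_fraction / 0.10)


def equivalent_use_hours(burn_in_hours, thermal_af, volt_af=1.0):
    """Field hours represented by a burn-in of the given length."""
    return burn_in_hours * thermal_af * volt_af


if __name__ == "__main__":
    thermal = arrhenius_af(55, 125)   # 55°C use, 125°C stress, Ea = 0.7 eV
    volt = voltage_af(0.10)           # 10% over-voltage at the low-end 3x figure
    hours = equivalent_use_hours(48, thermal, volt)
    print(f"thermal AF  = {thermal:.1f}")
    print(f"voltage AF  = {volt:.1f}")
    print(f"48 h burn-in ~ {hours:.0f} field hours ({hours / 8760:.2f} years)")
```

With these inputs the thermal factor comes out at roughly 78×, and even the low-end 3× voltage multiplier pushes 48 hours of burn-in past one year (8,760 hours) of equivalent use.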
Real-world example: Intel and AMD burn in every server CPU because a $10,000 Xeon failing in a customer's datacenter costs far more than the burn-in chamber's electricity. Automotive chips qualified to AEC-Q100 Grade 0 are stressed at 150°C for hundreds of hours, because your car's airbag controller cannot have a 1-in-1000 infant mortality rate. By contrast, cheap consumer parts often skip burn-in and rely on field returns instead; that's why your $2 USB charger sometimes dies in week one.
Burn-in isn't free: chambers cost millions, sockets wear out, and every hour at 125°C eats into the chip's remaining life. So engineers tune the recipe to kill weak units without significantly aging the survivors — typically targeting under 1% of total useful life consumed during burn-in.
