2026-05-24
Imagine you ask a contractor to fix a leaky faucet. They fix it, but they also rearrange your kitchen cabinets, repaint the bathroom, and replace the light fixtures in the hallway. Sure, some of those changes might be improvements, but now you can't easily tell whether the original leak is actually fixed, and your house looks different in ways you never asked for. This paper is about the software-engineering version of that problem.
The researchers studied coding agents — AI systems that take a bug report or feature request and try to fix the codebase automatically. They looked at 3,691 real patches from Multi-SWE-bench, a benchmark that tests these agents on actual GitHub issues. What they found is a pattern they call "refactoring runaway": the agent does the requested fix, but also throws in a bunch of unrelated code reorganization — renaming things, restructuring functions, tidying up code that had nothing to do with the bug.
Why does this happen? Two reasons:
The problem isn't that refactoring is bad. It's that tangled refactoring makes pull requests hard to review, hard to revert, and risky to merge. A pull request that bundles a 50-line bug fix with 200 lines of unrelated cleanup is far riskier than two separate, focused changes, because reviewers can't tell which lines are load-bearing.
The paper categorizes the kinds of tangled refactorings that show up most often, measures how widespread the problem is, and proposes mitigation strategies — essentially teaching agents to recognize when they're drifting off-task and to keep their changes minimal and focused on the original issue.
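To make "drifting off-task" concrete, here is a minimal Python sketch of one possible guardrail. It is not the paper's method: it is an illustrative heuristic that assumes a git-style unified diff and a caller-supplied set of files the issue is expected to touch. The names changed_lines_per_file, off_target_share, and SAMPLE_DIFF are hypothetical.

```python
from collections import Counter


def changed_lines_per_file(diff_text: str) -> Counter:
    """Count added plus removed lines per file in a git-style unified diff.

    Simplifying assumption: deleted files and renames are not handled.
    """
    counts = Counter()
    current = None      # path of the file whose hunks are being read
    in_hunk = False     # True once an "@@" hunk header has been seen
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            # A new file section starts; forget the previous file.
            current, in_hunk = None, False
        elif not in_hunk and line.startswith("+++ "):
            # New-file header, e.g. "+++ b/src/parser.py"
            path = line[4:].strip()
            current = path[2:] if path.startswith("b/") else path
        elif line.startswith("@@"):
            in_hunk = True
        elif in_hunk and current and line.startswith(("+", "-")):
            counts[current] += 1
    return counts


def off_target_share(diff_text: str, expected_files: set) -> float:
    """Fraction of changed lines that fall outside the expected files."""
    counts = changed_lines_per_file(diff_text)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    off_target = sum(n for path, n in counts.items() if path not in expected_files)
    return off_target / total


# Hypothetical patch: a one-line fix in parser.py plus an unrelated rename in utils.py.
SAMPLE_DIFF = """\
diff --git a/src/parser.py b/src/parser.py
--- a/src/parser.py
+++ b/src/parser.py
@@ -10,2 +10,2 @@
 def parse(x):
-    return x.split(",")
+    return x.strip().split(",")
diff --git a/src/utils.py b/src/utils.py
--- a/src/utils.py
+++ b/src/utils.py
@@ -1,2 +1,2 @@
-def hlp(a):
-    return a
+def helper(value):
+    return value
"""

# 4 of the 6 changed lines fall outside the expected file.
share = off_target_share(SAMPLE_DIFF, {"src/parser.py"})
```

A file-level check like this is a crude proxy: tangled refactoring can also sit inside the same file as the fix, so a high share is only a signal that a patch may deserve splitting, not proof that it does.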
The key insight is uncomfortable but important: making AI coders behave more like humans isn't always a good thing. Humans have bad habits too, and an agent that faithfully imitates those habits inherits them at scale. "Just fix the bug" turns out to be a non-trivial discipline to instill.
