METR finds AI slows veteran developers
METR’s randomized controlled trial tracked 16 experienced open-source maintainers across 246 real tasks and found that early-2025 AI coding tools made them 19% slower, not faster. The result is striking because developers expected a 24% speedup and mostly used frontier tools at the time, especially Cursor Pro with Claude 3.5 and 3.7 Sonnet.
This is one of the clearest reality checks yet on AI coding hype: benchmark wins and subjective “it feels faster” reports do not automatically translate into productivity gains inside mature codebases. The bigger story is not just slowdown, but how confidently experienced developers misread their own throughput while using these tools.
- The study focuses on a hard, high-signal setting: veteran contributors working in repositories they know deeply, where AI has less room to outperform hard-won human context.
- It directly challenges the industry habit of treating anecdotal speedups and benchmark scores as interchangeable with real-world engineering output.
- The paper is careful not to overclaim; it does not say AI is broadly useless, only that in this specific early-2025 setup it slowed senior open-source developers down.
- Public discussion around the paper has zeroed in on learning-curve effects, since many participants had limited prior Cursor experience, which makes this result damning for current workflows but not necessarily final for future ones.
- For AI tool builders, the implication is brutal: reducing prompt overhead, review burden, and context friction matters more than flashy benchmark demos if the goal is real developer acceleration.
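As a rough illustration of what a "19% slower" headline figure means, the sketch below computes a summary time ratio from paired task times. The per-task numbers are invented, and METR's actual estimator is more sophisticated than this; the point is only that an aggregate ratio above 1.0 means tasks took longer with AI assistance.

```python
import math

# Hypothetical per-task completion times in minutes (NOT the study's data):
# each pair is (time with AI allowed, time with AI disallowed).
paired_times = [(95, 80), (130, 110), (60, 50), (150, 125), (45, 38)]

# Summarize with the geometric mean of per-task time ratios (AI / no-AI).
# Averaging in log space keeps one very long task from dominating.
log_ratios = [math.log(ai / no_ai) for ai, no_ai in paired_times]
geo_mean_ratio = math.exp(sum(log_ratios) / len(log_ratios))

print(f"time ratio (AI / no-AI): {geo_mean_ratio:.2f}")  # > 1.0 means slower with AI
```

With these made-up numbers the ratio comes out to about 1.19, i.e. a 19% slowdown; a predicted 24% speedup would instead correspond to a ratio near 0.81.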
Discovered: 2026-03-06
Published: 2026-03-06
Author: DIY Smart Code