OPEN_SOURCE
YT · YOUTUBE // 18d ago // RESEARCH PAPER
Agentic AI Paper Shows 43-Point Gap
This paper reviews controlled trials and independent validations across software engineering, clinical documentation, and clinical decision support. Its core point is that agentic AI's promised productivity gains often vanish in real workflows: experienced open-source developers expected a 24% speedup but actually took 19% longer.
// ANALYSIS
The uncomfortable takeaway is that AI productivity claims keep collapsing when the whole workflow is measured, not just the model output. A 43-point gap is not a rounding error; it's a calibration failure.
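To make the arithmetic explicit, here is a minimal sketch, assuming the gap is simply expected minus observed speedup in percentage points (the numbers come from the study; the variable names are illustrative):

```python
# Expectation-realisation gap, assuming gap = expected - observed speedup.
expected_speedup_pct = 24.0   # developers forecast: 24% faster with AI
observed_speedup_pct = -19.0  # measured outcome: tasks took 19% longer

gap_points = expected_speedup_pct - observed_speedup_pct
print(f"Expectation-realisation gap: {gap_points:.0f} percentage points")  # -> 43
```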
- In the coding study, the time sink is prompting, waiting, and verification, which can erase any raw generation advantage.
- The clinical documentation examples show this is not just a coding problem: vendor claims still outrun independent measurements.
- Mature codebases and experienced engineers are the hardest test case, because tacit context and established workflows blunt the advantage AI is supposed to provide.
- Teams should measure end-to-end cycle time, quality, and review burden, not just suggestion acceptance or self-reported speedups (see the sketch after this list).
- The real lesson is calibration: use AI where it genuinely shortens work, but don't assume the demo gain survives contact with production constraints.
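As a concrete version of the end-to-end framing, here is a minimal sketch with a hypothetical per-phase task log; the phase names, durations, and baseline are illustrative assumptions, not data from the paper. The point is that a task can come out slower than the unassisted baseline once prompting, waiting, and review are counted alongside generation:

```python
from dataclasses import dataclass, field

# Hypothetical per-phase timing log for one AI-assisted task.
# All names and numbers below are illustrative assumptions.
@dataclass
class TaskLog:
    phases: dict[str, float] = field(default_factory=dict)  # phase -> seconds

    def record(self, phase: str, seconds: float) -> None:
        self.phases[phase] = self.phases.get(phase, 0.0) + seconds

    def total(self) -> float:
        return sum(self.phases.values())

log = TaskLog()
log.record("prompting", 180)
log.record("waiting_on_model", 240)
log.record("reviewing_output", 420)
log.record("fixing_and_integrating", 300)

baseline_seconds = 900  # hypothetical: same task done unassisted
speedup_pct = (baseline_seconds - log.total()) / baseline_seconds * 100
print(f"End-to-end: {log.total():.0f}s vs baseline {baseline_seconds}s "
      f"-> {speedup_pct:+.0f}% speedup")  # negative means slower overall
```

Measured this way, a model that generates code quickly can still lose to the baseline, which is exactly the pattern the coding study reports.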
// TAGS
quantifying-the-expectation-realisation-gap-for-agentic-ai-systems · research · ai-coding · agent
DISCOVERED
2026-03-24 (18d ago)
PUBLISHED
2026-03-24 (18d ago)
RELEVANCE
9/10
AUTHOR
DIY Smart Code