OPEN_SOURCE
YT · YOUTUBE // 18d ago // RESEARCH PAPER
Agentic AI Paper Shows 43-Point Gap
This paper reviews controlled trials and independent validations across software engineering, clinical documentation, and clinical decision support. Its core point is that agentic AI's promised productivity gains often vanish in real workflows: experienced open-source developers expected a 24% speedup but actually took 19% longer.
// ANALYSIS
The uncomfortable takeaway is that AI productivity claims keep collapsing when the whole workflow is measured, not just the model output. A 43-point gap is not a rounding error; it's a calibration failure.
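To make the arithmetic explicit, here is a minimal sketch, assuming the gap is simply expected minus observed speedup in percentage points (the numbers come from the study; the variable names are illustrative):

```python
# Expectation-realisation gap, assuming gap = expected - observed speedup.
expected_speedup_pct = 24.0   # developers forecast: 24% faster with AI
observed_speedup_pct = -19.0  # measured outcome: tasks took 19% longer

gap_points = expected_speedup_pct - observed_speedup_pct
print(f"Expectation-realisation gap: {gap_points:.0f} percentage points")  # -> 43
```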
- In the coding study, the time sink is prompting, waiting, and verification, which can erase any raw generation advantage.
- The clinical documentation examples show this is not just a coding problem: vendor claims still outrun independent measurements.
- Mature codebases and experienced engineers are the hardest test case, because tacit context and established workflows blunt the advantage AI is supposed to provide.
- Teams should measure end-to-end cycle time, quality, and review burden, not just suggestion acceptance or self-reported speedups (see the sketch after this list).
- The real lesson is calibration: use AI where it genuinely shortens work, but don't assume the demo gain survives contact with production constraints.
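As a concrete version of the end-to-end framing, here is a minimal sketch with a hypothetical per-phase task log; the phase names, durations, and baseline are illustrative assumptions, not data from the paper. The point is that a task can come out slower than the unassisted baseline once prompting, waiting, and review are counted alongside generation:

```python
from dataclasses import dataclass, field

# Hypothetical per-phase timing log for one AI-assisted task.
# All names and numbers below are illustrative assumptions.
@dataclass
class TaskLog:
    phases: dict[str, float] = field(default_factory=dict)  # phase -> seconds

    def record(self, phase: str, seconds: float) -> None:
        self.phases[phase] = self.phases.get(phase, 0.0) + seconds

    def total(self) -> float:
        return sum(self.phases.values())

log = TaskLog()
log.record("prompting", 180)
log.record("waiting_on_model", 240)
log.record("reviewing_output", 420)
log.record("fixing_and_integrating", 300)

baseline_seconds = 900  # hypothetical: same task done unassisted
speedup_pct = (baseline_seconds - log.total()) / baseline_seconds * 100
print(f"End-to-end: {log.total():.0f}s vs baseline {baseline_seconds}s "
      f"-> {speedup_pct:+.0f}% speedup")  # negative means slower overall
```

Measured this way, a model that generates code quickly can still lose to the baseline, which is exactly the pattern the coding study reports.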
// TAGS
quantifying-the-expectation-realisation-gap-for-agentic-ai-systems · research · ai-coding · agent
DISCOVERED
2026-03-24 (18d ago)
PUBLISHED
2026-03-24 (18d ago)
RELEVANCE
9/10
AUTHOR
DIY Smart Code