AI task horizons double every 7 months, METR finds
REDDIT // 29d ago // RESEARCH PAPER

METR's research finds that the length of tasks frontier AI agents can complete autonomously has doubled roughly every 7 months since 2019, with a recent acceleration to a roughly 4-month doubling time. On this trajectory, agents handling month-long software tasks could arrive within 2-4 years.

// ANALYSIS

This is the Moore's Law moment for AI agency — a clean exponential curve that makes vague AI hype claims concrete and measurable.

  • METR tested frontier models across ~230 tasks, finding an R² of 0.83 between human task length and agent success, an unusually tight fit for AI benchmarks
  • Current frontier models (e.g. Claude 3.7 Sonnet) succeed on tasks taking humans a few minutes but fail 90%+ of the time on 4-hour tasks — the "why isn't AI replacing me yet" gap explained
  • A 7-month doubling time since 2019, now accelerating to 4 months, suggests multi-day autonomous task completion could become routine in the 2026-2027 window
  • The self-referential implication is stark: if agents can automate AI research, the doubling rate itself could accelerate — METR explicitly flags this flywheel risk
  • An R² of 0.83 is high, but a good fit to past data does not predict discontinuous jumps; a single architectural breakthrough could break the curve in either direction
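
The doubling-time arithmetic behind the "2-4 years" estimate can be sketched as follows. This is a minimal illustration, not METR's method: the starting horizon of about 1 hour and the definition of a "month-long" task as about 167 working hours are assumptions that do not appear in the summary above, and the helper name `months_to_reach` is hypothetical.

```python
import math


def months_to_reach(target_hours: float, current_hours: float,
                    doubling_months: float) -> float:
    """Months for a task horizon to grow from current_hours to target_hours,
    assuming it doubles every doubling_months (pure exponential growth)."""
    doublings = math.log2(target_hours / current_hours)
    return doublings * doubling_months


# Assumptions for illustration only (not taken from the summary above):
CURRENT_HOURS = 1.0          # assumed current task horizon
MONTH_OF_WORK_HOURS = 167.0  # assumed length of a "month-long" task

if __name__ == "__main__":
    for doubling in (7.0, 4.0):
        months = months_to_reach(MONTH_OF_WORK_HOURS, CURRENT_HOURS, doubling)
        print(f"{doubling:.0f}-month doubling: "
              f"{months:.1f} months ({months / 12:.1f} years)")
```

Under these assumptions, a 7-month doubling time gives a bit over 4 years and a 4-month doubling time gives about 2.5 years. The result is sensitive to both assumed endpoints, so it should be read as a consistency check on the headline range rather than a forecast.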
// TAGS
llm agent benchmark research

DISCOVERED

29d ago

2026-03-14

PUBLISHED

33d ago

2026-03-10

RELEVANCE

8/10

AUTHOR

EchoOfOppenheimer