YOU ARE VIEWING ONE ITEM FROM THE AICRIER FEED

Karpathy's Autoresearch slashes eCLIP mean rank

// 66d ago · RESEARCH PAPER

Yogesh Kumar applied Karpathy's autoresearch loop to an old eCLIP research codebase, letting Claude Code iterate on `train.py` inside a locked-down containerized sandbox. In 42 runs over one Saturday, the agent cut validation mean rank by 54%, mostly by fixing a temperature clamp and retuning hyperparameters.
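
For readers unfamiliar with the metric, here is a minimal sketch of how a retrieval mean rank is typically computed. It assumes a square similarity matrix in which the correct candidate for query `i` sits at index `i`; the function name `mean_rank` and that layout are illustrative assumptions, not details taken from the eCLIP codebase.

```python
def mean_rank(sim):
    """Average rank of the correct candidate across all queries.

    sim[i][j] is the similarity of query i to candidate j.
    Assumption: the correct match for query i is candidate i.
    Lower is better; 1.0 means every query ranks its match first.
    """
    ranks = []
    for i, row in enumerate(sim):
        # Rank is 1 plus the number of candidates that score strictly higher
        # than the correct one.
        rank = 1 + sum(1 for score in row if score > row[i])
        ranks.append(rank)
    return sum(ranks) / len(ranks)


if __name__ == "__main__":
    toy = [
        [0.9, 0.1, 0.2],  # correct match ranked 1st
        [0.8, 0.3, 0.1],  # correct match ranked 2nd
        [0.1, 0.2, 0.7],  # correct match ranked 1st
    ]
    print(mean_rank(toy))
```

Because lower is better, a 54% cut means the correct match sits, on average, less than half as far down the ranked list as before.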

// ANALYSIS

This is a strong proof of concept for agentic research, but the real story is scoping: once the task is bounded by a single metric, a single file, and a hard time budget, the agent can do useful work. It looks less like an autonomous scientist and more like a very fast ablation engine that still needs a human to set the question.
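
The scoping described above (one metric, one editable artifact, a hard budget) can be sketched as a simple keep-or-revert loop. This is an illustration under stated assumptions, not the autoresearch implementation: `propose` and `evaluate` are hypothetical stand-ins for the agent's edit to `train.py` and a training run, and the metric is assumed to be lower-is-better.

```python
import time


def run_bounded_loop(propose, evaluate, baseline_score, max_runs,
                     time_budget_s, clock=time.monotonic):
    """Try candidate changes one at a time, keeping only improvements.

    propose(run_index) -> a candidate change (hypothetical stand-in for an edit)
    evaluate(change)   -> metric value, lower is better (e.g. mean rank)
    """
    start = clock()
    best_score = baseline_score
    best_change = None
    history = []
    for run in range(max_runs):
        if clock() - start >= time_budget_s:
            break  # hard time budget reached; stop exploring
        change = propose(run)
        score = evaluate(change)
        kept = score < best_score
        if kept:
            # Keep the change; otherwise it is implicitly reverted.
            best_score, best_change = score, change
        history.append((run, change, score, kept))
    return best_change, best_score, history


if __name__ == "__main__":
    fake_scores = [10.0, 12.0, 7.0, 8.0]  # made-up values for illustration
    best, score, log = run_bounded_loop(
        propose=lambda run: run,
        evaluate=lambda change: fake_scores[change],
        baseline_score=11.0,
        max_runs=4,
        time_budget_s=60.0,
    )
    print(best, score, [kept for *_, kept in log])
```

The `history` list is what makes the loop reviewable: every attempt is recorded, whether it was kept or not.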

  • `program.md` is the real control surface, effectively acting like a lightweight operating system for the agent.
  • The sandbox and permission lock mattered as much as the model, because they kept the loop safe and reviewable.
  • The biggest gain came from a temperature clamp bug fix, which says a lot about how much low-hanging fruit still hides in research code.
  • Hyperparameter tuning delivered more value than architectural changes, which is exactly the kind of search current agents are good at.
  • Once the exploration moved into moonshot ideas, success dropped sharply, showing the ceiling of today’s autonomous research loops.
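
The summary does not say what was wrong with the temperature clamp, so the following is only a sketch of what a correct clamp commonly looks like in CLIP-style training, not the fix the agent made. It assumes a learnable log-scale parameter whose exponential multiplies the similarity logits, capped at 100 as in the original CLIP setup; the names `logit_scale` and `MAX_LOGIT_SCALE` are illustrative.

```python
import math

# Assumption: CLIP-style cap on the logit multiplier (temperature >= 0.01).
MAX_LOGIT_SCALE = 100.0


def logit_scale(log_scale_param, max_scale=MAX_LOGIT_SCALE):
    """Turn a learnable log-scale parameter into a bounded logit multiplier."""
    return min(math.exp(log_scale_param), max_scale)


def temperature(log_scale_param):
    """Temperature is the reciprocal of the logit multiplier."""
    return 1.0 / logit_scale(log_scale_param)


if __name__ == "__main__":
    init = math.log(1 / 0.07)  # common CLIP-style initialization
    print(logit_scale(init))   # well below the cap
    print(logit_scale(10.0))   # exp(10) exceeds the cap, so it is clamped
```

A clamp applied to the wrong quantity (for example, the raw parameter instead of its exponential) or with the wrong bound can silently flatten or sharpen the softmax, which is one plausible way a single-line fix could move a retrieval metric this much.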
// TAGS
autoresearch agent research ai-coding automation open-source

DISCOVERED

66d ago

2026-03-23

PUBLISHED

67d ago

2026-03-23

RELEVANCE

8/10

AUTHOR

ykumards