Autoresearch plugin brings Karpathy loop to Claude Code

// 66d ago · OPENSOURCE RELEASE

Autoresearch turns Karpathy's one-file, one-metric experiment loop into a Claude Code plugin for real codebases. On a production Django/pgvector/Cohere search stack, 60 iterations kept only 3 changes, but the run still exposed the real bottlenecks, validated one weighting scheme, and caught a Redis cache-key bug.
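
As a rough illustration of the one-metric, keep-or-revert loop described above, here is a minimal Python sketch. It is not the plugin's code: `run_loop`, `toy_propose`, and `toy_evaluate` are hypothetical names, and the toy metric stands in for a real evaluation such as a search-quality score. The only idea it tries to capture is that a change survives only when the single metric improves, and everything else is reverted.

```python
import random
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]


def run_loop(
    baseline: Config,
    propose: Callable[[Config, random.Random], Config],
    evaluate: Callable[[Config], float],
    iterations: int,
    seed: int = 0,
) -> Tuple[Config, float, List[bool]]:
    """Keep a proposed change only if the single metric strictly improves."""
    rng = random.Random(seed)  # fixed seed so repeated runs are comparable
    best = dict(baseline)
    best_score = evaluate(best)
    kept: List[bool] = []
    for _ in range(iterations):
        candidate = propose(best, rng)
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score  # keep the change
            kept.append(True)
        else:
            kept.append(False)  # revert: the candidate is discarded
    return best, best_score, kept


# Hypothetical stand-ins: a real setup would edit code and run an eval suite.
def toy_propose(cfg: Config, rng: random.Random) -> Config:
    new = dict(cfg)
    new["weight"] = cfg["weight"] + rng.uniform(-0.2, 0.2)
    return new


def toy_evaluate(cfg: Config) -> float:
    # Higher is better; the (assumed) optimum sits at weight = 0.7.
    return -abs(cfg["weight"] - 0.7)


best, score, kept = run_loop({"weight": 0.0}, toy_propose, toy_evaluate, iterations=60)
print(f"kept {sum(kept)} of {len(kept)} changes, final score {score:.3f}")
```

Most iterations in a loop like this end in the revert branch once the easy gains are taken, which is consistent with the low keep rate reported for the run.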

// ANALYSIS

The point here isn’t a flashy score jump; it’s that the agent spent its failures buying certainty. A 93% failure rate is a feature when each revert narrows the search space and tells you where to stop tuning.

  • Ranking, not recall, was the bottleneck, so larger candidate pools and title matching were mostly dead ends.
  • The adaptive weighting survived ablation, which is exactly the kind of “we should simplify this” assumption autoresearch can settle fast.
  • Round 2 showed a classic co-optimization trap: prompt changes invalidated the previously tuned weights, and stale cache keys masked the effect until the caching bug was found.
  • This only works when the eval path is cheap and deterministic, with noisy inputs like temperature pinned down.
  • The broader win is workflow, not optimization magic: let Claude explore the edges overnight, then spend human time on the architectural ceiling, not manual guesswork.
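
The stale-cache-key problem mentioned above can be sketched in a few lines. The summary does not describe the actual bug or its fix, so this is only one plausible guard, under stated assumptions: the cache key is derived from every input that affects the result (here a hypothetical `config` dict holding things like a prompt version and weights), so that changing any of them cannot silently return an old cached answer. A plain dict stands in for Redis, and `cache_key` and `cached_search` are illustrative names.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List

_cache: Dict[str, List[str]] = {}  # stand-in for a Redis instance


def cache_key(query: str, config: Dict[str, Any], prefix: str = "search") -> str:
    """Build a key that changes whenever the query or any config value changes."""
    # sort_keys makes the serialization independent of dict insertion order.
    payload = json.dumps({"query": query, "config": config}, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"{prefix}:{digest}"


def cached_search(
    query: str,
    config: Dict[str, Any],
    search_fn: Callable[[str, Dict[str, Any]], List[str]],
) -> List[str]:
    """Return a cached result only if it was computed under the same config."""
    key = cache_key(query, config)
    if key not in _cache:
        _cache[key] = search_fn(query, config)
    return _cache[key]


# Usage with a hypothetical search function that records how often it runs.
calls: List[str] = []


def fake_search(query: str, config: Dict[str, Any]) -> List[str]:
    calls.append(config["prompt_version"])
    return [f"{query}-result-{config['prompt_version']}"]


first = cached_search("django", {"prompt_version": "v1", "title_weight": 0.3}, fake_search)
again = cached_search("django", {"title_weight": 0.3, "prompt_version": "v1"}, fake_search)
changed = cached_search("django", {"prompt_version": "v2", "title_weight": 0.3}, fake_search)
```

With a key built only from the query, the third call would have returned the `v1` result and hidden the effect of the prompt change from the metric.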
// TAGS
autoresearch, claude-code, agent, ai-coding, automation, open-source, research

DISCOVERED

66d ago

2026-03-24

PUBLISHED

66d ago

2026-03-24

RELEVANCE

9/10

AUTHOR

hookedonwinter