AdamBench ranks local coding LLMs

// 61d ago · BENCHMARK RESULT

AdamBench is a self-published benchmark for local LLMs in a simple agentic-coding workflow, run on an RTX 5080 16GB + 64GB RAM workstation. The repo includes prompts, review outputs, methodology, and visualizations; Qwen3.5 122b A10b won overall, while Qwen3.5 35b A3b and gpt-oss-20b/120b look like the most practical daily picks.

// ANALYSIS

This feels less like a universal leaderboard and more like a brutally honest local-model reality check, which is exactly why it’s useful. The score rewards not only raw coding quality but also how well a model survives iterative repair loops without wasting time or tokens.

  • Qwen3.5 122b A10b takes the top AdamBench score, but Qwen3.5 35b A3b is the author’s daily driver because it balances quality, speed, and context headroom.
  • gpt-oss-120b and gpt-oss-20b are the standout surprises: fast for their size and unusually token-efficient, which matters a lot in agentic coding.
  • Nemotron models lag hard on quality and efficiency; even the best one lands around the top 10, with huge reasoning-token overhead.
  • The benchmark is intentionally single-run and self-repair heavy, so it measures real workflow resilience more than clean one-shot coding ability.
  • Models that failed on tool calling or chat templates were excluded, which is harsh but sensible if the goal is practical local usability.
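
To make the "self-repair heavy" idea concrete, here is a minimal Python sketch of a repair loop that tracks attempts and token spend. It is not AdamBench's actual harness or scoring: `run_with_repairs`, `RepairResult`, `fake_generate`, `fake_review`, and the token counts are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class RepairResult:
    passed: bool       # did any attempt survive review?
    attempts: int      # how many generations were needed
    tokens_used: int   # total tokens spent across all attempts


def run_with_repairs(
    generate: Callable[[str, Optional[str]], Tuple[str, int]],
    review: Callable[[str], Optional[str]],
    task: str,
    max_attempts: int = 3,
) -> RepairResult:
    """Generate a solution, feed review feedback back in, stop on success or budget."""
    feedback: Optional[str] = None
    tokens = 0
    for attempt in range(1, max_attempts + 1):
        code, used = generate(task, feedback)
        tokens += used
        feedback = review(code)  # None means the review found nothing to fix
        if feedback is None:
            return RepairResult(True, attempt, tokens)
    return RepairResult(False, max_attempts, tokens)


# Hypothetical stand-in for a model: buggy first draft, fixed after feedback.
def fake_generate(task: str, feedback: Optional[str]) -> Tuple[str, int]:
    if feedback is None:
        return "def add(a, b): return a - b", 120
    return "def add(a, b): return a + b", 80


# Hypothetical stand-in for a reviewer: runs the code against one check.
def fake_review(code: str) -> Optional[str]:
    scope: dict = {}
    exec(code, scope)
    return None if scope["add"](2, 3) == 5 else "add(2, 3) should return 5"


result = run_with_repairs(fake_generate, fake_review, "write add(a, b)")
print(result)
```

In this toy run the model needs one repair round, so a token-hungry model pays twice: once per attempt and again for every extra loop it triggers.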
// TAGS
adambench, benchmark, ai-coding, agent, llm, open-source, self-hosted

DISCOVERED

61d ago

2026-03-26

PUBLISHED

62d ago

2026-03-26

RELEVANCE

9/10

AUTHOR

Real_Ebb_7417