Dirac tops TerminalBench on Gemini 3 Flash

// 45d ago | BENCHMARK RESULT

Dirac, an open-source coding agent, claims a 65.2% score on TerminalBench 2.0 using Gemini-3-flash-preview. That is well ahead of Google’s official 47.8% and narrowly ahead of Junie CLI’s 64.3%. The author says the run used the fully open-source repo and no cheating mechanisms.
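
To put the reported numbers side by side, the short sketch below computes Dirac's lead over each of the other two scores in percentage points. The scores are the ones quoted in the post; the dictionary labels and the `margin_pp` helper are only illustrative names.

```python
# Scores as reported in the post (percent on TerminalBench 2.0).
scores = {
    "Dirac": 65.2,
    "Junie CLI": 64.3,
    "Google (official)": 47.8,
}


def margin_pp(leader: str, other: str) -> float:
    """Return the lead of `leader` over `other` in percentage points."""
    # Round to one decimal to hide floating-point noise.
    return round(scores[leader] - scores[other], 1)


# Dirac's lead over every other entry.
margins = {name: margin_pp("Dirac", name) for name in scores if name != "Dirac"}

for name, pp in margins.items():
    print(f"Dirac vs {name}: +{pp} pp")
```

The lead over Junie CLI is under one percentage point, while the lead over Google's official figure is more than seventeen. Whether the narrow lead is meaningful depends on run-to-run variance, which the post does not report.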

// ANALYSIS

Dirac’s result is a reminder that benchmark outcomes are often as much about harness quality as model choice. If the run holds up, it strengthens the case that context curation, edit precision, and tool orchestration can swing agent performance materially.

  • The reported 65.2% TerminalBench 2.0 score would put an open-source agent ahead of both Google’s own submission and the current closed-source leader cited in the post.
  • The author explicitly says no `agents/skills.md` files were inserted, no resource or timeout changes were made, and the exact GitHub codebase was used for the run.
  • Dirac’s positioning around hash-anchored edits, AST-aware manipulation, and token efficiency fits the kind of workflow TerminalBench is meant to stress.
  • The post also highlights a real benchmark problem: if the community doubts compliance, the score matters less than the reproducibility story around it.
  • Until the leaderboard accepts the submission, this reads as a strong but still provisional signal that agent scaffolding can be a competitive advantage.
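
The post does not describe how Dirac implements hash-anchored edits, so the following is only a generic sketch of the idea, not Dirac's code. It assumes an edit carries a short hash of the line it expects to replace, and the edit is rejected if the file has changed since the agent last read it. The names `line_hash` and `apply_anchored_edit` are hypothetical.

```python
import hashlib


def line_hash(line: str) -> str:
    """Short content hash used as an anchor (hypothetical scheme)."""
    return hashlib.sha256(line.encode("utf-8")).hexdigest()[:8]


def apply_anchored_edit(lines, index, expected_hash, new_line):
    """Replace lines[index] only if its current hash matches the anchor.

    Raises ValueError when the anchor is stale, so an edit is never
    applied to content the agent has not actually seen.
    """
    if not 0 <= index < len(lines):
        raise IndexError(f"line index {index} out of range")
    if line_hash(lines[index]) != expected_hash:
        raise ValueError("stale anchor: line changed since it was read")
    # Return a new list instead of mutating the caller's copy.
    return lines[:index] + [new_line] + lines[index + 1:]


source = ["def add(a, b):", "    return a - b"]
anchor = line_hash(source[1])  # recorded when the agent read the file
patched = apply_anchored_edit(source, 1, anchor, "    return a + b")
```

Under this assumption, a mismatched anchor fails loudly rather than silently overwriting the wrong line, which is one way "edit precision" could translate into fewer wasted agent turns.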
// TAGS
dirac, cli, open-source, ai-coding, agent, benchmark

DISCOVERED

45d ago

2026-04-27

PUBLISHED

45d ago

2026-04-27

RELEVANCE

9/10

AUTHOR

GodelNumbering