FINAL-Bench’s Darwin-36B-Opus hits 88.4% on GPQA Diamond

// 45d ago · BENCHMARK RESULT

Darwin-36B-Opus is a 36-billion-parameter mixture-of-experts language model released on Hugging Face by FINAL-Bench and built with the Darwin V7 evolutionary breeding engine. It is derived from two public parents: Qwen/Qwen3.6-35B-A3B as the father and a Claude Opus 4.6 reasoning-distilled variant of that base as the mother. The release claims the model preserves the distilled reasoning behavior of the mother while keeping the father’s expert topology, and says the automated breeding process can produce a bfloat16 checkpoint in under an hour on a single GPU. Its headline result is 88.4% on GPQA Diamond, which the post presents as a new high point for the Darwin family.
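The post does not say how the Darwin V7 engine actually recombines its parents. Purely as an illustration of what evolutionary merging of two same-shaped checkpoints can look like, here is a minimal sketch under stated assumptions: each "checkpoint" is a toy dict of named float lists, a child is a per-tensor linear blend of father and mother, and a simple mutate-and-select loop searches for the blend ratios that maximize a caller-supplied score. The names `blend`, `evolve`, and `fitness` are hypothetical and are not taken from the release.

```python
import random

def blend(father, mother, ratios):
    """Per-tensor linear interpolation: ratio 0 keeps the father, 1 keeps the mother."""
    return {
        name: [(1 - ratios[name]) * f + ratios[name] * m
               for f, m in zip(father[name], mother[name])]
        for name in father
    }

def evolve(father, mother, fitness, generations=20, population=8, seed=0):
    """Toy evolutionary search over per-tensor blend ratios (assumed scheme, not Darwin V7)."""
    rng = random.Random(seed)
    names = list(father)

    def score(ratios):
        return fitness(blend(father, mother, ratios))

    # Start from random ratios and keep the best-scoring candidate.
    best = max(({n: rng.random() for n in names} for _ in range(population)), key=score)
    for _ in range(generations):
        # Mutate the current best with Gaussian noise, clamped to [0, 1].
        children = [
            {n: min(1.0, max(0.0, best[n] + rng.gauss(0, 0.1))) for n in names}
            for _ in range(population)
        ]
        candidate = max(children, key=score)
        if score(candidate) > score(best):
            best = candidate
    return best
```

In a real setting the fitness function would be a benchmark evaluation of the merged model, which is the expensive part; the toy version above only shows the control flow.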

// ANALYSIS

Strong benchmark-driven release, but the real story is the evolutionary training pipeline rather than a brand-new base model.

  • The model is positioned as a recombination of two public Qwen-derived parents, not a conventional retrain from scratch.
  • The claimed 88.4% GPQA Diamond score is the main proof point and the reason this post reads like a benchmark release.
  • If the result holds up across independent evals, this is notable because it suggests the Darwin pipeline can reliably preserve reasoning gains through breeding.
  • The practical appeal is portability: a deployable bf16 checkpoint and Hugging Face availability make it easy for the open-model crowd to test.
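To put "if the result holds up" in perspective, a rough back-of-the-envelope check helps. The sketch below assumes the GPQA Diamond subset has 198 questions (the figure from the GPQA paper, not from this post), that each question was scored once, and that a normal-approximation interval is acceptable. The function name `score_interval` is hypothetical.

```python
import math

def score_interval(reported, n_questions=198, z=1.96):
    """Convert a reported accuracy into a correct-answer count and a rough 95% interval."""
    # Assumption: the reported figure is a rounded fraction of n_questions.
    correct = round(reported * n_questions)
    accuracy = correct / n_questions
    # Binomial standard error under the normal approximation.
    stderr = math.sqrt(accuracy * (1 - accuracy) / n_questions)
    return correct, accuracy - z * stderr, accuracy + z * stderr

correct, low, high = score_interval(0.884)
print(f"{correct} correct, approx. 95% interval {low:.1%} to {high:.1%}")
```

Under these assumptions 88.4% corresponds to 175 of 198 questions, and the sampling uncertainty on a test set this small is several percentage points either way, so single-run differences of a point or two between models are not conclusive on their own.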
// TAGS
llm, moe, huggingface, qwen, reasoning, benchmark, gpqa, open-source

DISCOVERED

45d ago

2026-04-25

PUBLISHED

45d ago

2026-04-25

RELEVANCE

9/10

AUTHOR

jacek2023