Qwen3.5 397B quant hits 93% MMLU

// 68d ago · BENCHMARK RESULT

A community MLX quantization of Qwen3.5-397B-A17B claims 93% on a 200-question MMLU run while fitting into 180GB and sustaining about 38 tokens per second on M3 Ultra hardware. The post is really a local-inference benchmark story, not a new base model release.
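
As a rough sanity check on the footprint and speed figures, the arithmetic below estimates the average bits stored per weight and the weight traffic implied by the reported throughput. It is a back-of-envelope sketch, not the quant author's method. It assumes 180GB means decimal gigabytes, that the whole footprint is model weights, that quantization is roughly uniform across experts, and that each generated token reads every active parameter once. The function names are hypothetical.

```python
TOTAL_PARAMS = 397e9      # total parameters, from the model name
ACTIVE_PARAMS = 17e9      # active parameters per token (the "A17B" suffix)
FOOTPRINT_BYTES = 180e9   # assumption: "180GB" read as decimal gigabytes
TOKENS_PER_SEC = 38       # reported throughput on M3 Ultra


def bits_per_weight(footprint_bytes: float, total_params: float) -> float:
    """Average bits stored per parameter if the footprint is all weights."""
    return footprint_bytes * 8 / total_params


def weight_read_rate_gb_s(
    footprint_bytes: float,
    total_params: float,
    active_params: float,
    tokens_per_sec: float,
) -> float:
    """GB/s of weight reads if each token touches every active parameter once."""
    bytes_per_token = active_params * footprint_bytes / total_params
    return bytes_per_token * tokens_per_sec / 1e9


bpw = bits_per_weight(FOOTPRINT_BYTES, TOTAL_PARAMS)
rate = weight_read_rate_gb_s(
    FOOTPRINT_BYTES, TOTAL_PARAMS, ACTIVE_PARAMS, TOKENS_PER_SEC
)
print(f"{bpw:.2f} bits per weight")
print(f"{rate:.0f} GB/s of weight reads at {TOKENS_PER_SEC} tokens/s")
```

Under these assumptions the build averages about 3.6 bits per weight and reads roughly 7.7 GB of weights per token. The estimate ignores KV cache and activation traffic, so it is only a starting point for judging whether the speed claim is plausible.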

// ANALYSIS

This is a strong reminder that the local-model arms race is shifting from “can it run?” to “which quantization preserves quality without killing speed?”

  • The underlying official model is Qwen3.5-397B-A17B, a 397B-total, 17B-active MoE model; the community quant here is trying to squeeze frontier-class capability into practical Apple Silicon memory budgets.
  • The headline 93% figure is self-reported on a 200-question MMLU slice, so it’s interesting but not directly comparable to the official Qwen benchmark table, which reports MMLU-Pro and other standardized evals.
  • The meaningful angle for developers is the tradeoff curve: this build appears smaller than some other MLX 4-bit ports while claiming better throughput, which matters if you care about interactive local usage.
  • The author’s note about weaker coding performance lines up with the usual pattern: quantization and MoE routing can preserve reasoning scores better than they preserve messy real-world coding behavior.
  • If others replicate the speed claim, this becomes one of the more compelling “big model, local enough” options for experimentation on high-memory Macs.
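
To put the 200-question caveat in numbers, the sketch below computes a 95% Wilson score interval for the reported accuracy. It assumes 93% means 186 of 200 correct and that the questions behave like an independent random sample of MMLU. Neither assumption is confirmed by the post.

```python
import math


def wilson_interval(correct: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


# Assumption: 93% on 200 questions corresponds to 186 correct answers.
low, high = wilson_interval(186, 200)
print(f"{low:.1%} to {high:.1%}")
```

With these inputs the interval spans roughly 89% to 96%, so a few points of difference between quants on a slice this small would sit inside sampling noise.
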
// TAGS
qwen3.5, llm, benchmark, open-weights, mlx, inference, reasoning

DISCOVERED: 68d ago (2026-03-20)

PUBLISHED: 68d ago (2026-03-20)

RELEVANCE: 9/10

AUTHOR: HealthyCommunicat