Qwen 3.5 MoE tops Gemma 4 M5 benchmarks

// 55d ago · BENCHMARK RESULT

Benchmarks on a MacBook M5 (128GB RAM) running the oMLX framework show that Qwen 3.5 MoE remains the throughput leader for local agentic workloads, despite Gemma 4's gains in responsiveness. The results highlight the M5's new Neural Accelerator, which provides up to 4x faster prompt processing, and the effectiveness of oMLX's tiered KV caching in reducing latency for long-context, multi-turn interactions.
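To make the idea of tiered KV caching concrete, here is a minimal Python sketch of a two-tier prefix cache: a small in-memory LRU tier that spills evicted entries to files on disk and promotes them back on a later hit. It is not oMLX's implementation or API. The class name `TieredPrefixCache`, the `max_hot_entries` parameter, the `.kv` file suffix, and the use of `pickle` as a stand-in for real KV tensor serialization are all assumptions made for illustration.

```python
import hashlib
import pickle
import tempfile
from collections import OrderedDict
from pathlib import Path


class TieredPrefixCache:
    """Hypothetical two-tier cache: hot tier in memory (LRU), cold tier on disk."""

    def __init__(self, cache_dir, max_hot_entries=2):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.max_hot_entries = max_hot_entries
        self.hot = OrderedDict()  # key -> kv_state, least recently used first

    @staticmethod
    def _key(prefix_tokens):
        # Identify a context prefix by a hash of its token ids.
        data = ",".join(map(str, prefix_tokens)).encode()
        return hashlib.sha256(data).hexdigest()

    def _path(self, key):
        return self.cache_dir / f"{key}.kv"

    def put(self, prefix_tokens, kv_state):
        key = self._key(prefix_tokens)
        self.hot[key] = kv_state
        self.hot.move_to_end(key)
        # Spill least recently used entries to disk instead of discarding them.
        while len(self.hot) > self.max_hot_entries:
            old_key, old_state = self.hot.popitem(last=False)
            self._path(old_key).write_bytes(pickle.dumps(old_state))

    def get(self, prefix_tokens):
        key = self._key(prefix_tokens)
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key]
        path = self._path(key)
        if path.exists():
            # Cold hit: restore from disk and promote back into the hot tier.
            kv_state = pickle.loads(path.read_bytes())
            self.put(prefix_tokens, kv_state)
            return kv_state
        return None  # Miss: the caller would have to run prefill again.


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache = TieredPrefixCache(tmp, max_hot_entries=1)
        cache.put([1, 2, 3], {"layers": "state for turn 1"})
        cache.put([4, 5], {"layers": "state for turn 2"})  # spills [1, 2, 3] to disk
        print(cache.get([1, 2, 3]))  # restored from the disk tier
```

The design point this illustrates is that a disk hit replaces a full prefill pass with a file read, which is where the latency saving for long multi-turn contexts would come from.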

// ANALYSIS

The M5 Max and oMLX are turning local Macs into viable high-performance inference servers, with MoE architectures clearly winning on Apple Silicon.

  • Qwen 3.5 MoE (35B-A3B) is the current performance champion, achieving 92.2 tok/s for generation and nearly 2,850 tok/s for prompt processing.
  • oMLX's tiered KV caching leverages SSD storage to restore context prefixes in under 2 seconds, a massive improvement over the 60+ second prefill times seen in standard MLX implementations.
  • The M5's Neural Accelerator specifically boosts the prefill stage, making dense models more responsive but not yet competitive with MoE throughput.
  • While Gemma 4 is more memory-efficient and responsive for "edge" tasks, it lags behind Qwen in sustained batching and serving performance for heavy developer workloads.
  • SSD-based context persistence is becoming the new baseline for "agentic" local LLM tools like Claude Code and Cursor.
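
As a rough way to read the throughput figures above, the sketch below turns them into per-turn latency estimates. It assumes prefill and generation run back to back at the reported rates (roughly 2,850 tok/s and 92.2 tok/s) and ignores caching, batching, and model load time. The function name `estimate_turn` and the example token counts are illustrative and not taken from the benchmark.

```python
from dataclasses import dataclass

PREFILL_TPS = 2850.0  # reported as "nearly 2,850 tok/s" for prompt processing
DECODE_TPS = 92.2     # reported generation speed for Qwen 3.5 MoE (35B-A3B)


@dataclass
class TurnEstimate:
    prefill_s: float
    decode_s: float

    @property
    def total_s(self) -> float:
        return self.prefill_s + self.decode_s


def estimate_turn(prompt_tokens, output_tokens,
                  prefill_tps=PREFILL_TPS, decode_tps=DECODE_TPS):
    """Estimate seconds spent on prompt processing and generation for one turn."""
    return TurnEstimate(
        prefill_s=prompt_tokens / prefill_tps,
        decode_s=output_tokens / decode_tps,
    )


if __name__ == "__main__":
    # Hypothetical agentic turn: a long context and a short reply.
    est = estimate_turn(prompt_tokens=32_000, output_tokens=500)
    print(f"prefill {est.prefill_s:.1f}s, decode {est.decode_s:.1f}s, "
          f"total {est.total_s:.1f}s")
```

Under these assumptions, prefill time grows linearly with context length, which is why restoring a cached prefix instead of reprocessing it matters most for long multi-turn sessions.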
// TAGS
omlx, mlx, apple-silicon, llm, benchmark, gemma-4, qwen3-5, edge-ai

DISCOVERED: 55d ago (2026-04-03)
PUBLISHED: 55d ago (2026-04-02)
RELEVANCE: 8/10
AUTHOR: onil_gova