Gemma 4 tops Qwen 3.5 in Go benchmarks
Miguel Filipe published a benchmarking study evaluating local MoE models like Gemma 4 and Qwen 3.5 using a functional Go integration test harness on a Framework 13 laptop. The results reveal a significant gap between theoretical performance and actual functional reliability in quantized local environments.
Functional execution testing is the only benchmark that matters for AI coding, as it exposes the flakiness that synthetic evals hide. Gemma 4 26B-A4B emerged as the winner and proved resistant to quantization degradation, while Qwen 3.5 35B struggled with consistency and compile failures despite its larger parameter count. The study highlights that longer context length and higher memory bandwidth are critical for local MoE architectures on mobile platforms like the Ryzen AI 9 HX 370.
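To make the idea of flakiness concrete, here is a minimal Go sketch of how repeated generate-compile-test trials for a single task might be tallied. It is not the study's harness: the `Outcome` categories, the `Summary` type, and the `Flaky` rule are hypothetical names chosen for illustration, and the sample outcomes are made up rather than taken from the results.

```go
package main

import "fmt"

// Outcome is the result of one generate-compile-test trial.
// These categories are an assumption, not the study's own taxonomy.
type Outcome int

const (
	Pass Outcome = iota
	CompileFail
	TestFail
)

// Summary aggregates repeated trials of the same task.
type Summary struct {
	Trials, Passes, CompileFails, TestFails int
}

// Summarize counts each kind of outcome across all trials.
func Summarize(outcomes []Outcome) Summary {
	s := Summary{Trials: len(outcomes)}
	for _, o := range outcomes {
		switch o {
		case Pass:
			s.Passes++
		case CompileFail:
			s.CompileFails++
		case TestFail:
			s.TestFails++
		}
	}
	return s
}

// PassRate returns the fraction of passing trials, or 0 when there are none.
func (s Summary) PassRate() float64 {
	if s.Trials == 0 {
		return 0
	}
	return float64(s.Passes) / float64(s.Trials)
}

// Flaky reports whether the same task both passed and failed across trials.
func (s Summary) Flaky() bool {
	return s.Passes > 0 && s.Passes < s.Trials
}

func main() {
	// Made-up outcomes for illustration only, not results from the study.
	runs := []Outcome{Pass, CompileFail, Pass, TestFail, Pass}
	s := Summarize(runs)
	fmt.Printf("pass rate %.2f, flaky=%v\n", s.PassRate(), s.Flaky())
}
```

A single-shot eval would record only one of these trials, which is why a model that passes three times out of five can look either perfect or broken depending on the draw. Separating compile failures from test failures also shows whether a model tends to produce invalid Go or valid Go with wrong behavior.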
DISCOVERED
2026-04-11
PUBLISHED
2026-04-10
AUTHOR
m3thos