Gemma 4 31B beats Qwen in Pac-Man test
In a local one-shot Pac-Man gamedev contest on a MacBook Pro M5 Max, Gemma 4 31B won by producing a shorter, clearer, and more functional game than Qwen 3.6 27B. The post argues that for local coding tasks, output quality and time to a finished result matter more than raw tokens per second.
The interesting part here is not the speed number alone; it is that Gemma used far fewer tokens and still delivered the more usable result. For one-shot game generation, that is often the better tradeoff because the model has to get logic, state, and interactions right in one pass.
- Gemma finished much faster and with far less token spend, which matters on local hardware where iteration speed is part of the product
- Pac-Man is a decent stress test for codegen because it forces pathfinding, collision handling, game state, and rendering to work together
- Qwen’s longer output may have been more creative, but verbosity is a liability when the goal is a playable result, not just an impressive transcript
- For local LLM workflows, “best” is often task-specific: concise correctness beats elaborate prose for gameplay code
- The post reinforces a broader point for AI builders: benchmark models on end-to-end utility, not just raw throughput
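The throughput-versus-utility point can be made concrete with a little arithmetic: wall-clock time is output tokens divided by tokens per second, so a model that decodes more slowly can still finish first if it writes less. The sketch below is a minimal illustration only. The `Run` class, the `rank_by_utility` helper, the model names, and every number are hypothetical placeholders, not figures from the post.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One one-shot generation attempt (all values here are hypothetical)."""
    name: str
    output_tokens: int         # tokens the model emitted for its answer
    tokens_per_second: float   # decode throughput on the local machine
    playable: bool             # did the generated game actually run?

    @property
    def wall_clock_seconds(self) -> float:
        # Time to a finished answer = tokens emitted / decode speed.
        return self.output_tokens / self.tokens_per_second


def rank_by_utility(runs):
    """Order runs by end-to-end usefulness: playable results first,
    then shortest wall-clock time. Raw throughput is not a sort key."""
    return sorted(runs, key=lambda r: (not r.playable, r.wall_clock_seconds))


# Placeholder numbers chosen only to illustrate the tradeoff.
runs = [
    Run("model_a", output_tokens=4_000, tokens_per_second=20.0, playable=True),
    Run("model_b", output_tokens=12_000, tokens_per_second=30.0, playable=True),
]

for run in rank_by_utility(runs):
    print(f"{run.name}: {run.wall_clock_seconds:.0f}s at "
          f"{run.tokens_per_second:.0f} tok/s, playable={run.playable}")
```

With these placeholder values, `model_a` ranks first: it decodes at 20 tok/s against 30 tok/s for `model_b`, but it needs 200 seconds to finish where `model_b` needs 400. A real comparison would substitute measured token counts, measured throughput, and a manual or automated playability check.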
DISCOVERED
2026-05-01
PUBLISHED
2026-05-01
AUTHOR
gladkos