Qwen, Gemma vie for M4 coding
A LocalLLaMA thread asks which local models make sense for heavy coding on a 24GB MacBook M4, with commenters pointing to Qwen and Gemma-class models while warning that cloud models still win for serious coding work. The practical advice centers on smaller quantized models, MLX-aware runtimes, and tools like Ollama or LM Studio.
The useful takeaway is not "run the biggest model you can squeeze in"; it is that 24GB Apple Silicon is a capable local inference box, but still a compromise for coding agents.
- Qwen gets the strongest community nod, with 14B-class models fitting comfortably and 30B-ish quantized models possible but tighter on memory and context
- Gemma is framed as a safer fit on 24GB, while larger Qwen variants may trade too much speed or headroom for quality
- For actual heavy coding, commenters still favor Claude, Codex, or other paid remote models, using local LLMs for drafts, private tasks, or backend automation
- MLX support matters on Mac because memory bandwidth and Apple Silicon-native loaders can make the difference between usable and frustrating
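The "comfortable versus tighter" split between 14B-class and 30B-class models can be sanity-checked with back-of-envelope arithmetic. The sketch below is not from the thread: it assumes roughly 4.5 bits per weight for a 4-bit quantization (including scale metadata) and treats the full 24 GiB as the budget, even though the OS, other apps, and the KV cache all need a share of it. The function names are hypothetical.

```python
# Rough sizing sketch. All constants below are illustrative assumptions,
# not measurements from the thread.

TOTAL_MEMORY_GIB = 24.0      # assumed: unified memory on the MacBook
BITS_PER_WEIGHT = 4.5        # assumed: ~4-bit quantization plus metadata


def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def headroom_gib(params_billion: float,
                 bits_per_weight: float = BITS_PER_WEIGHT,
                 total_gib: float = TOTAL_MEMORY_GIB) -> float:
    """Memory left for the OS, apps, and KV cache after loading weights."""
    return total_gib - weight_gib(params_billion, bits_per_weight)


if __name__ == "__main__":
    for label, params in [("14B-class", 14.0), ("30B-class", 30.0)]:
        size = weight_gib(params, BITS_PER_WEIGHT)
        left = headroom_gib(params)
        print(f"{label}: weights ~{size:.1f} GiB, headroom ~{left:.1f} GiB")
```

Under these assumptions the 30B-class model leaves about half the headroom of the 14B-class one, and that remainder is what long contexts and everything else on the machine have to share.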
DISCOVERED
45d ago
2026-04-21
PUBLISHED
45d ago
2026-04-21
AUTHOR
Extra-Perception2408
