Qwen, Gemma vie for M4 coding
OPEN_SOURCE
REDDIT // 5h ago · INFRASTRUCTURE

A LocalLLaMA thread asks which local models make sense for heavy coding on a MacBook with an M4 chip and 24GB of unified memory. Commenters point to Qwen- and Gemma-class models while warning that cloud models still win for serious coding work, and the practical advice centers on smaller quantized models, MLX-aware runtimes, and tools like Ollama or LM Studio.
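
As a rough illustration of the local workflow the thread describes, here is a minimal sketch using the ollama Python client against a quantized Qwen coder model. The model tag, prompt, and choice of the Python client over the CLI are assumptions; check the Ollama model library for exact names and quantizations.

    # Minimal sketch: query a locally served, quantized Qwen coder model
    # via the ollama Python client (pip install ollama). Assumes the Ollama
    # daemon is running and the model has been pulled beforehand, e.g.
    # `ollama pull qwen2.5-coder:14b` (tag assumed; verify in the library).
    import ollama

    MODEL = "qwen2.5-coder:14b"  # assumed tag; adjust to the quant you pulled

    response = ollama.chat(
        model=MODEL,
        messages=[
            {"role": "user",
             "content": "Write a Python function that parses ISO 8601 timestamps."}
        ],
    )
    print(response["message"]["content"])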

// ANALYSIS

The useful takeaway is not "run the biggest model you can squeeze in"; it is that 24GB Apple Silicon is a capable local inference box, but still a compromise for coding agents.

  • Qwen gets the strongest community nod, with 14B-class models fitting comfortably and 30B-ish quantized models possible but tighter on memory and context (see the memory sketch after this list)
  • Gemma is framed as a safer fit on 24GB, while larger Qwen variants may trade too much speed or headroom for quality
  • For actual heavy coding, commenters still favor Claude, Codex, or other paid remote models, using local LLMs for drafts, private tasks, or backend automation
  • MLX support matters on Mac: Apple Silicon-native loaders exploit the chip's unified memory bandwidth, which can make the difference between a usable and a frustrating setup
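
To make the 14B-vs-30B fit concrete, here is a back-of-the-envelope estimate of whether quantized weights plus KV cache fit in 24GB of unified memory. The per-parameter byte count, cache size, overhead, and usable-memory fraction are rough assumptions, not measured values.

    # Rough memory-fit check for a quantized model on a 24GB Apple Silicon Mac.
    # Every constant here is an assumption used for illustration only.
    def fits_in_memory(params_b: float,
                       bytes_per_param: float = 0.55,  # ~4-bit quant incl. scales
                       kv_cache_gb: float = 2.0,       # modest coding context
                       overhead_gb: float = 1.5,       # runtime buffers, tokenizer
                       total_gb: float = 24.0,
                       usable_fraction: float = 0.75) -> None:
        # macOS reserves memory for the system and other apps, so only a
        # fraction of the 24GB can realistically be wired for inference.
        weights_gb = params_b * bytes_per_param
        needed_gb = weights_gb + kv_cache_gb + overhead_gb
        budget_gb = total_gb * usable_fraction
        verdict = "fits" if needed_gb <= budget_gb else "too tight"
        print(f"{params_b:>5.1f}B params: ~{needed_gb:.1f} GB needed "
              f"vs ~{budget_gb:.1f} GB usable -> {verdict}")

    for size_b in (7, 14, 30, 32):
        fits_in_memory(size_b)

Under these assumptions a 14B model lands around 11GB and fits comfortably, while 30B-class models push past the usable budget, matching the thread's advice.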
// TAGS
qwen · gemma · llm · ai-coding · inference · edge-ai · open-weights

DISCOVERED

5h ago

2026-04-21

PUBLISHED

6h ago

2026-04-21

RELEVANCE

6 / 10

AUTHOR

Extra-Perception2408