OPEN_SOURCE
REDDIT // 14d ago · INFRASTRUCTURE
LM Studio steers M4 Max users to Qwen3-Coder
A Reddit user who upgraded from a 16GB base M5 to an M4 Max 36GB asks which local coding model makes the most sense in LM Studio. The thread quickly lands on Qwen3-Coder-30B-A3B as the practical Apple Silicon pick for pure coding.
// ANALYSIS
36GB of unified memory is the sweet spot where local coding stops being a toy and becomes a real workflow. Qwen3-Coder-30B-A3B wins here because it keeps active parameters low while leaving enough headroom for context and repo-scale work.
- `Qwen3-Coder-30B-A3B` is a 30.5B-parameter MoE model with only 3.3B active weights per token, so its runtime footprint is far friendlier than the raw parameter count suggests.
- LM Studio supports the model in both GGUF and MLX formats, and the Mac-native MLX path is the right way to squeeze extra speed out of Apple Silicon.
- For pure coding, a specialist model beats a generalist at this memory tier; `Qwen3.5-35B-A3B` is the fallback only if you want breadth over code focus.
- The real constraint on a 36GB Mac is not whether a model fits but how much context and KV-cache headroom you can keep without slowing the machine down.
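The headroom point above is easy to sanity-check with back-of-the-envelope arithmetic. The sketch below is a rough estimate, not a measurement: the quantization width (~4.5 bits/weight for a Q4-class GGUF) and the KV-cache architecture numbers (48 layers, 4 GQA KV heads, head dim 128) are illustrative assumptions, not figures from the thread.

```python
# Rough memory-budget sketch for a 30.5B MoE coder model on a 36GB Mac.
# All architecture numbers are assumptions for illustration.

GIB = 1024 ** 3

def weights_gib(total_params: float, bits_per_weight: float) -> float:
    """Approximate in-RAM size of the quantized weights."""
    return total_params * bits_per_weight / 8 / GIB

def kv_cache_gib(tokens: int, layers: int, kv_heads: int,
                 head_dim: int, bytes_per_elem: int = 2) -> float:
    """KV cache = 2 (K and V) * layers * kv_heads * head_dim * tokens."""
    return 2 * layers * kv_heads * head_dim * tokens * bytes_per_elem / GIB

weights = weights_gib(30.5e9, 4.5)        # Q4-class quant, ~4.5 bits/weight
cache = kv_cache_gib(32_768, 48, 4, 128)  # 32k-token context, fp16 cache
print(f"weights ~ {weights:.1f} GiB, 32k KV cache ~ {cache:.1f} GiB")
```

Under these assumptions the weights land around 16 GiB and a full 32k-token fp16 KV cache around 3 GiB, leaving roughly half of a 36GB machine for the OS, the editor, and cache growth; the same arithmetic shows why a dense 30B model at higher precision would not leave that margin.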
// TAGS
llm · ai-coding · inference · edge-ai · self-hosted · lm-studio · qwen3-coder · m4-max
DISCOVERED
2026-03-28
PUBLISHED
2026-03-28
RELEVANCE
8/10
AUTHOR
Mews