OPEN_SOURCE ↗
REDDIT // 6h ago // NEWS
Qwen3.5-35B-A3B Pushes Mac Memory Limits
A Reddit thread asks whether an M5 Pro with 24GB unified memory can comfortably run Qwen3.5-35B-A3B or dense 27B models locally. The replies lean hard toward 48GB, with users saying 24GB runs into memory pressure fast once context length grows.
// ANALYSIS
The short answer is that 24GB may be fine for general development, but it is not the comfortable tier for local 35B-class LLM hobby work. If you want room for larger context, fewer compromises, and less babysitting, 48GB is the safer buy.
- Qwen3.5-35B-A3B is a 35B-total MoE model with 3B active parameters, so it is efficient for its class but still not "small" in memory terms.
- The thread's comments are consistent: 24GB is described as barely enough, while 48GB avoids the yellow-zone memory pressure people hit on Apple Silicon.
- For local inference, unified memory matters as much as raw CPU/GPU speed because KV cache and context quickly eat the available headroom.
- If the goal is occasional experimentation, 24GB can work with aggressive quantization and shorter contexts; if the goal is a serious local LLM setup, 48GB is the practical minimum.
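The memory pressure the thread describes can be sketched with a back-of-envelope calculation: quantized weights are roughly `total_params × bits / 8`, and the KV cache grows linearly with context length. The architecture numbers below (layers, KV heads, head dimension) are illustrative assumptions for a 35B-class model, not the published Qwen3.5-35B-A3B config.

```python
# Rough memory estimate for a local 35B-class MoE model.
# All architecture numbers are illustrative assumptions,
# NOT the published Qwen3.5-35B-A3B configuration.

def weights_gb(total_params_b: float, bits_per_weight: float) -> float:
    """Memory for quantized weights in GB (decimal)."""
    return total_params_b * 1e9 * bits_per_weight / 8 / 1e9

def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                context_len: int, bytes_per_elem: int = 2) -> float:
    """KV cache in GB: keys + values for every layer and cached token."""
    return (2 * n_layers * n_kv_heads * head_dim
            * context_len * bytes_per_elem) / 1e9

# Assumed (hypothetical) config: 48 layers, 8 KV heads, head dim 128, fp16 cache.
w = weights_gb(35, 4)                       # 4-bit weights -> 17.5 GB
kv_8k = kv_cache_gb(48, 8, 128, 8_192)      # ~1.6 GB at 8k context
kv_64k = kv_cache_gb(48, 8, 128, 65_536)    # ~12.9 GB at 64k context

print(f"weights: {w:.1f} GB, KV@8k: {kv_8k:.1f} GB, KV@64k: {kv_64k:.1f} GB")
```

Under these assumptions the weights alone take ~17.5 GB, so on a 24GB machine the OS, apps, and KV cache compete for the remaining ~6.5 GB, while a 48GB machine absorbs even a 64k-token cache comfortably. That matches the thread's "barely enough vs. comfortable" split.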
// TAGS
llm · self-hosted · inference · qwen3-5-35b-a3b · apple-silicon
DISCOVERED
6h ago
2026-04-18
PUBLISHED
7h ago
2026-04-18
RELEVANCE
6 / 10
AUTHOR
umutkarakoc