Qwen3.5-35B-A3B Pushes Mac Memory Limits
OPEN_SOURCE
REDDIT · 6h ago · NEWS


A Reddit thread asks whether an M5 Pro with 24GB unified memory can comfortably run Qwen3.5-35B-A3B or dense 27B models locally. The replies lean hard toward 48GB, with users saying 24GB runs into memory pressure fast once context length grows.

// ANALYSIS

The short answer is that 24GB may be fine for general development, but it is not the comfortable tier for local 35B-class LLM hobby work. If you want room for larger context, fewer compromises, and less babysitting, 48GB is the safer buy.

  • Qwen3.5-35B-A3B is a 35B-total MoE model with 3B active parameters, so it is efficient for its class but still not "small" in memory terms.
  • The thread's comments are consistent: 24GB is described as barely enough, while 48GB avoids the yellow-zone memory pressure (as reported in macOS Activity Monitor) that people hit on Apple Silicon.
  • For local inference, unified memory capacity matters as much as raw CPU/GPU speed, because the model weights and the KV cache share the same pool, and the cache grows with context length until it eats the available headroom.
  • If the goal is occasional experimentation, 24GB can work with aggressive quantization and shorter contexts; if the goal is a serious local LLM setup, 48GB is the practical minimum.
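The tradeoff in the bullets above can be made concrete with back-of-the-envelope arithmetic: quantized weights scale with total (not active) parameter count, and the KV cache scales with context length. The sketch below uses illustrative, assumed model dimensions (layer count, KV heads, head size), not Qwen3.5-35B-A3B's published specs:

```python
# Rough memory estimate for a quantized local LLM on unified memory.
# All model dimensions below are illustrative assumptions, not official specs.

def weights_gb(total_params_b: float, bits_per_weight: float) -> float:
    """Approximate in-memory size of quantized weights, in GB.
    MoE note: all experts must be resident, so the *total* 35B counts here,
    even though only ~3B parameters are active per token."""
    return total_params_b * bits_per_weight / 8

def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context_len: int, bytes_per_elem: int = 2) -> float:
    """KV cache = 2 (K and V) * layers * kv_heads * head_dim * tokens * bytes,
    assuming fp16 cache entries (2 bytes each)."""
    return 2 * layers * kv_heads * head_dim * context_len * bytes_per_elem / 1e9

# Hypothetical 35B-class config at a ~4.5-bit quant with a 32k context:
w = weights_gb(35, 4.5)
kv = kv_cache_gb(layers=48, kv_heads=8, head_dim=128, context_len=32768)
print(f"weights ~{w:.1f} GB + KV cache ~{kv:.1f} GB = ~{w + kv:.1f} GB")
```

Under these assumptions the total lands around 26GB before counting macOS and application overhead, which is why a 24GB machine hits memory pressure as soon as context grows, while 48GB leaves real headroom.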
// TAGS
llm · self-hosted · inference · qwen3-5-35b-a3b · apple-silicon

DISCOVERED

6h ago

2026-04-18

PUBLISHED

7h ago

2026-04-18

RELEVANCE

6 / 10

AUTHOR

umutkarakoc