Reddit asks which laptop runs Qwen3.5-35B-A3B locally
OPEN_SOURCE
REDDIT · 28d ago · TUTORIAL


A developer on r/LocalLLaMA asks which budget laptop (~$1000) can run Qwen3.5-35B-A3B locally for private coding. The post compares several configurations, from an RTX 4060 with 64GB of RAM up to an RTX 5080 with 64GB, after the author found that an MSI Vector GP68 (RTX 4080, 12GB VRAM, 64GB system RAM) achieves 11 tokens/s.

// ANALYSIS

This is a community help thread, not an announcement — but it surfaces a real insight: VRAM alone doesn't determine local LLM performance; system RAM capacity and CPU-GPU memory sharing are equally critical for large MoE models.
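A back-of-envelope sketch makes the VRAM-plus-RAM point concrete. All figures below are illustrative assumptions, not measurements from the thread: ~4.5 bits/weight for a typical 4-bit GGUF-style quantization, a 12GB laptop GPU, and a rough 3 GiB budget for KV cache and overhead.

```python
# Memory-split sketch for a ~35B-parameter model at 4-bit quantization.
# All constants are assumptions for illustration, not thread data.

PARAMS = 35e9                # total parameter count
BITS_PER_WEIGHT = 4.5        # ~4-bit quant including scale metadata (assumed)
GIB = 1024**3

weights_gib = PARAMS * BITS_PER_WEIGHT / 8 / GIB

vram_gib = 12                # e.g. an RTX 4080 laptop GPU
kv_and_overhead_gib = 3      # rough budget for KV cache and runtime overhead

fits_in_vram = max(vram_gib - kv_and_overhead_gib, 0)
spill_to_ram = max(weights_gib - fits_in_vram, 0)

print(f"quantized weights : {weights_gib:.1f} GiB")
print(f"resident in VRAM  : {fits_in_vram:.1f} GiB")
print(f"spilled to RAM    : {spill_to_ram:.1f} GiB")
```

Under these assumptions roughly half the quantized weights don't fit in VRAM, which is why system RAM capacity, not just the GPU, gates whether the model runs at all.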

  • Qwen3.5-35B-A3B is a Mixture-of-Experts model that activates only about 3B parameters per token (the "A3B" suffix), making it unusually RAM-friendly despite its 35B total parameter count
  • The HP Omen Max with RTX 5080 (16GB VRAM) failed while an older RTX 4080 (12GB VRAM) + 64GB system RAM succeeded — demonstrating that total addressable memory matters more than GPU VRAM alone
  • 64GB of system RAM appears to be the threshold that enables this model class on consumer laptops
  • The trend toward CPU-GPU unified or shared memory (seen in Apple Silicon and AMD APUs) may give those platforms an edge for local LLM inference over discrete GPU laptops
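The RAM-friendliness claim in the bullets above can be sketched numerically: decode is roughly memory-bandwidth bound, and a MoE model only reads its active experts' weights per token. The ~3B active figure follows the "A3B" naming convention; the 4.5 bits/weight and ~60 GB/s dual-channel DDR5 bandwidth are rough assumptions, and real throughput lands well below these ceilings.

```python
# Upper-bound tokens/s when weights are streamed from system RAM:
# dense 35B must read every weight per token; the A3B MoE reads ~3B.
# All constants are illustrative assumptions.

GB = 1e9
BYTES_PER_WEIGHT = 4.5 / 8          # ~4-bit quantization (assumed)

dense_read = 35e9 * BYTES_PER_WEIGHT / GB   # GB read per token, dense
moe_read   = 3e9  * BYTES_PER_WEIGHT / GB   # GB read per token, MoE active set

ram_bw = 60                          # GB/s, dual-channel DDR5 (assumed)

print(f"dense : {dense_read:.1f} GB/token -> <= {ram_bw/dense_read:.0f} t/s from RAM")
print(f"MoE   : {moe_read:.2f} GB/token -> <= {ram_bw/moe_read:.0f} t/s from RAM")
```

Even with these crude numbers, a dense 35B model would be capped at single-digit tokens/s from system RAM, while the MoE's ceiling is an order of magnitude higher, which is consistent with the reported 11 t/s on partially offloaded hardware.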
// TAGS
llm · inference · edge-ai · open-source · qwen3.5-35b-a3b

DISCOVERED

2026-03-15 (28d ago)

PUBLISHED

2026-03-15 (28d ago)

RELEVANCE

5 / 10

AUTHOR

SnooOnions6041