OPEN_SOURCE
REDDIT // 28d ago // TUTORIAL
Reddit asks which laptop runs Qwen3.5-35B-A3B locally
A developer on r/LocalLLaMA asks which budget laptop (~$1000) can run Qwen3.5-35B-A3B locally for private coding. The post compares several configurations — from an RTX 4060 with 64GB RAM to an RTX 5080 with 64GB RAM — after finding that an MSI Vector GP68 with an RTX 4080 (12GB VRAM) and 64GB RAM achieves 11 t/s.
// ANALYSIS
This is a community help thread, not an announcement — but it surfaces a real insight: VRAM alone doesn't determine local LLM performance; system RAM capacity and CPU-GPU memory sharing are equally critical for large MoE models.
- Qwen3.5-35B-A3B is a Mixture-of-Experts model that activates only about 3B parameters per token (the "A3B" suffix), making it unusually RAM-friendly despite its 35B total parameter count
- The HP Omen Max with an RTX 5080 (16GB VRAM) failed, while an older RTX 4080 (12GB VRAM) paired with 64GB of system RAM succeeded, demonstrating that total addressable memory matters more than GPU VRAM alone
- 64GB of system RAM appears to be the threshold that enables this model class on consumer laptops
- The trend toward CPU-GPU unified or shared memory (seen in Apple Silicon and AMD APUs) may give those platforms an edge for local LLM inference over discrete-GPU laptops
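The "total addressable memory" point above can be sketched with a back-of-envelope sizing check. This is a minimal illustration, not data from the thread: the `fits` function, the ~4.5 bits/weight quantization figure (a Q4-style GGUF quant), and the fixed overhead budget for KV cache and runtime buffers are all assumptions.

```python
# Rough sizing check for running a quantized model with partial GPU offload
# (as llama.cpp-style runtimes do). All numbers are illustrative assumptions.

def fits(total_params_b: float, bits_per_weight: float,
         vram_gb: float, ram_gb: float, overhead_gb: float = 8.0) -> bool:
    """True if quantized weights plus a fixed overhead budget (KV cache,
    OS, runtime buffers) fit in VRAM and system RAM combined."""
    weights_gb = total_params_b * bits_per_weight / 8  # GB per billion params
    return weights_gb + overhead_gb <= vram_gb + ram_gb

# A 35B model at ~4.5 bits/weight needs roughly 20 GB for weights alone,
# so 12 GB of VRAM by itself falls short, but 12 GB VRAM + 64 GB RAM is ample:
print(fits(35, 4.5, vram_gb=12, ram_gb=0))    # False: VRAM alone too small
print(fits(35, 4.5, vram_gb=12, ram_gb=64))   # True: weights spill into RAM
```

Spilling layers into system RAM is what drops throughput to the ~11 t/s the poster measured; the MoE's small active parameter count (~3B per token) is what keeps that figure usable.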
// TAGS
llm · inference · edge-ai · open-source · qwen3.5-35b-a3b
DISCOVERED
28d ago
2026-03-15
PUBLISHED
28d ago
2026-03-15
RELEVANCE
5/10
AUTHOR
SnooOnions6041