Qwen3.6-35B-A3B guide targets 32GB Macs
OPEN_SOURCE ↗
REDDIT · 5h ago · TUTORIAL

This is a hands-on guide for running Qwen3.6-35B-A3B locally on an M2 MacBook Pro with 32GB RAM using llama.cpp and OpenCode. The post argues that quantization, a 128K context cap, and careful RAM discipline make a surprisingly capable local coding setup practical, if still fragile.
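The "32GB RAM discipline" claim comes down to simple arithmetic. A rough sketch of the budget, where the bits-per-parameter figure and quant level are assumptions for illustration rather than numbers from the post:

```shell
#!/bin/sh
# Back-of-envelope unified-memory budget for a 32 GiB Mac.
# The 4.8 bits/param figure approximates a Q4_K_M-class GGUF quant
# (assumption, not from the post).
PARAMS=35e9     # total parameters (MoE: only ~3B active per token)
BITS=4.8        # effective bits per parameter after quantization (assumed)
TOTAL_GIB=32    # machine's unified memory

weights=$(awk -v p="$PARAMS" -v b="$BITS" \
  'BEGIN { printf "%.1f", p * b / 8 / 1024^3 }')
left=$(awk -v t="$TOTAL_GIB" -v w="$weights" \
  'BEGIN { printf "%.1f", t - w }')
echo "quantized weights: ~${weights} GiB"
echo "remaining for KV cache, OS, browser: ~${left} GiB"
```

At these assumed numbers the weights alone eat roughly two-thirds of the machine, which is why the post treats the 128K context cap and browser-tab discipline as load-bearing rather than optional.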

// ANALYSIS

The real story here is not just “local AI on a Mac,” but how much system engineering it takes to make a frontier-ish coding model usable under tight memory pressure.

  • The setup leans on a quantized GGUF checkpoint plus `mmproj` support so the model can handle both code and screenshots through llama.cpp.
  • Qwen3.6-35B-A3B is positioned as an efficient open-weight MoE model with 35B total and 3B active parameters, so the value prop is density, not brute force.
  • OpenCode matters because it makes local models feel like a real coding agent instead of a toy terminal chat.
  • The author’s results are nuanced: solid on adapter-style, test-driven work, weaker on geometry-heavy UI debugging and large integration hunts.
  • The tuning advice is practical: keep context high enough to avoid collapse, but leave enough headroom for unified memory, browser tabs, and the rest of the machine.
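A setup like the one described would typically be launched with `llama-server` from llama.cpp; the sketch below is an assumed invocation (file names are hypothetical placeholders, and the exact flags used in the post are not given), shown as a config fragment rather than a definitive command:

```shell
# Hypothetical launch matching the post's described setup:
# quantized GGUF + mmproj for image input, 128K context cap.
llama-server \
  -m  qwen3.6-35b-a3b-q4_k_m.gguf \   # quantized checkpoint (placeholder name)
  --mmproj mmproj-qwen3.6.gguf \      # vision projector for screenshots (placeholder)
  -c  131072 \                        # 128K context cap from the post
  -ngl 99 \                           # offload all layers to Metal
  --port 8080                         # local OpenAI-compatible endpoint
```

OpenCode would then be pointed at `http://localhost:8080` as a local OpenAI-compatible provider; the trade-off in `-c` is exactly the one the bullets describe, since every extra context token costs KV-cache memory that competes with the rest of the machine.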
// TAGS
qwen3.6-35b-a3b · opencode · llama.cpp · ai-coding · agent · cli · self-hosted · multimodal

DISCOVERED

5h ago

2026-04-25

PUBLISHED

7h ago

2026-04-25

RELEVANCE

9/10

AUTHOR

boutell