OPEN_SOURCE ↗
REDDIT // 5h ago · TUTORIAL
Qwen3.6-35B-A3B guide targets 32GB Macs
This is a hands-on guide for running Qwen3.6-35B-A3B locally on an M2 MacBook Pro with 32GB RAM using llama.cpp and OpenCode. The post argues that quantization, a 128K context cap, and careful RAM discipline make a surprisingly capable local coding setup practical, if still fragile.
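The setup described boils down to a single llama.cpp server invocation. A minimal sketch is below; the GGUF and projector filenames are hypothetical placeholders, not taken from the post, and the exact quantization level the author used is not specified here:

```shell
# Serve a quantized Qwen3.6-35B-A3B GGUF through llama.cpp's built-in server.
# -c caps the context window at 128K tokens; -ngl 99 offloads all layers to
# the M2's unified-memory GPU; --mmproj loads the multimodal projector so
# the model can also accept screenshots.
llama-server \
  -m qwen3.6-35b-a3b-q4_k_m.gguf \
  --mmproj qwen3.6-35b-a3b-mmproj.gguf \
  -c 131072 \
  -ngl 99 \
  --port 8080
```

OpenCode (or any OpenAI-compatible client) can then point at `http://localhost:8080` as its model endpoint.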
// ANALYSIS
The real story here is not just “local AI on a Mac,” but how much system engineering it takes to make a frontier-ish coding model usable under tight memory pressure.
- The setup leans on a quantized GGUF checkpoint plus `mmproj` support, so the model can handle both code and screenshots through llama.cpp.
- Qwen3.6-35B-A3B is positioned as an efficient open-weight MoE model with 35B total and 3B active parameters, so the value proposition is density, not brute force.
- OpenCode matters because it makes local models feel like a real coding agent instead of a toy terminal chat.
- The author’s results are nuanced: solid on adapter-style, test-driven work, weaker on geometry-heavy UI debugging and large integration hunts.
- The tuning advice is practical: keep context high enough to avoid collapse, but leave enough headroom for unified memory, browser tabs, and the rest of the machine.
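The headroom tradeoff in the last point can be made concrete with a back-of-envelope KV-cache estimate. The layer and head dimensions below are illustrative placeholders, not Qwen3.6-35B-A3B's published config; the point is only that at 128K context, the cache alone can claim most of a 32 GB machine:

```python
def kv_cache_bytes(n_ctx: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_elem: int = 2) -> int:
    """Rough KV-cache size: one K and one V tensor per layer, fp16 by default."""
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem

# Illustrative dimensions only (NOT the model's real config):
gib = kv_cache_bytes(n_ctx=131072, n_layers=48, n_kv_heads=8, head_dim=128) / 2**30
print(f"{gib:.1f} GiB")  # prints "24.0 GiB" for these placeholder dimensions
```

Quantized KV caches and grouped-query attention shrink this considerably, which is why the post's "RAM discipline" framing centers on the context cap.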
// TAGS
qwen3.6-35b-a3b · opencode · llama.cpp · ai-coding · agent · cli · self-hosted · multimodal
DISCOVERED
5h ago
2026-04-25
PUBLISHED
7h ago
2026-04-25
RELEVANCE
9/10
AUTHOR
boutell