Qwen3.6-35B-A3B coding hits 32GB RAM wall

// 90d ago OPEN SOURCE RELEASE

A developer report on running the Qwen3.6-35B-A3B MoE model for local agentic coding on a 32GB Mac describes serious context-management problems. The model shows frontier-level reasoning, but the 32k-token context limit imposed by the hardware causes that reasoning to break down during complex repository-wide tasks.
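To make the memory pressure concrete, here is a rough budget sketch in Python. It is a back-of-the-envelope estimate, not a measurement from the report: the 4-bit quantization, the layer count, the KV-head count, and the head dimension are all illustrative assumptions rather than the model's published configuration, and the KV formula assumes every layer keeps a full key/value cache.

```python
def weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Memory for the model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


def kv_cache_gib(n_tokens: int, n_layers: int, n_kv_heads: int,
                 head_dim: int, bytes_per_value: int = 2) -> float:
    """KV-cache memory in GiB: one key and one value vector per layer per token."""
    bytes_per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return bytes_per_token * n_tokens / 2**30


if __name__ == "__main__":
    # ASSUMPTIONS (hypothetical values, not the real model config):
    # 35e9 parameters stored at 4 bits each; 48 layers, 4 KV heads, head_dim 128.
    weights = weights_gib(35e9, bits_per_weight=4)
    cache = kv_cache_gib(32 * 1024, n_layers=48, n_kv_heads=4, head_dim=128)
    print(f"weights: {weights:.1f} GiB, KV cache at 32k tokens: {cache:.1f} GiB")
```

Whatever the real figures are, the operating system, the agent tooling, and any editor or browser must also fit in the same 32GB, which is why the usable context ends up capped.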

// ANALYSIS

Local LLMs are reaching frontier performance, but 32GB of RAM is becoming the new bottleneck for real-world agentic workflows.

– Qwen3.6-35B-A3B excels in benchmarks but struggles with context compaction in local agent loops such as OpenCode and Claude Code.
– A 32k context is insufficient for "rooting around" non-trivial codebases, leading to hallucinated file paths and loss of task state.
– Disabling subagents frees memory temporarily, but the run still fails once the reasoning chain extends beyond the second compaction pass.
– The failure highlights a growing gap between the model's "thinking" capabilities and the memory overhead required for persistent local agency.
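The compaction behavior described above can be illustrated with a minimal Python sketch. This is not how OpenCode or Claude Code implement compaction; `estimate_tokens`, `compact`, and the `summarize` callback are hypothetical names, and the four-characters-per-token rule is a crude assumption.

```python
from typing import Callable, List


def estimate_tokens(text: str) -> int:
    # Crude heuristic (assumption): roughly 4 characters per token.
    return max(1, len(text) // 4)


def compact(history: List[str], budget: int, keep_recent: int,
            summarize: Callable[[List[str]], str]) -> List[str]:
    """Replace older messages with one summary when the history exceeds the budget."""
    if keep_recent < 1:
        raise ValueError("keep_recent must be at least 1")
    total = sum(estimate_tokens(m) for m in history)
    if total <= budget or len(history) <= keep_recent:
        return history  # still fits, or nothing old enough to fold away
    old, recent = history[:-keep_recent], history[-keep_recent:]
    # Everything in `old` survives only as whatever the summary retains.
    return [summarize(old)] + recent


if __name__ == "__main__":
    # Hypothetical history: ten messages of about 100 tokens each.
    history = ["x" * 400 for _ in range(10)]
    compacted = compact(history, budget=500, keep_recent=2,
                        summarize=lambda msgs: f"summary of {len(msgs)} messages")
    print(len(history), "->", len(compacted), "messages")
```

Each pass folds earlier detail into a summary, so exact file paths or intermediate task state can be lost unless the summary happens to retain them. A second pass summarizes a history that already contains a summary, which compounds that loss.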

// TAGS

qwen3.6-35b-a3b llm ai-coding agent cli open-weights opencode

DISCOVERED

90d ago

2026-04-20

PUBLISHED

90d ago

2026-04-19

RELEVANCE

8/10

AUTHOR

boutell

// KEEP READING

More AI developer news from the feed

BENCHMARK 40m ago

Kimi K3 matches top models in Aikido benchmark

Aikido Security has added Moonshot's Kimi K3 open-weight model to its AI Code Analysis benchmark, which tests models on rediscovering 26 known vulnerabilities (CVEs). At pass@3, Kimi K3 successfully identified 23 of the 26 CVEs, matching the performance of top-tier models.

OPEN SOURCE 49m ago

Windows Terminal consolidates command-line interfaces

Windows Terminal is Microsoft's modern, open-source console host that consolidates Command Prompt, PowerShell, and WSL into a tabbed interface. It features GPU-accelerated text rendering, deep JSON customizability, and rich Unicode support.

OPEN SOURCE 49m ago

KTransformers runs 100B+ LLMs on consumer hardware

Developed by the kvcache-ai community, KTransformers is a heterogeneous CPU-GPU inference framework designed to run massive 100B+ MoE models on consumer-grade hardware. By utilizing AMX-specialized CPU kernels and asynchronous task scheduling, it offloads weight matrices dynamically between VRAM and system memory to achieve high processing speeds.