Qwen3.6 35B A3B Coheres Better on Q8

// 53d ago · MODEL RELEASE


A LocalLLaMA user says Qwen3.6-35B-A3B fell apart in a low-bit IQ4_XS quant, but became rock solid after moving to an Unsloth UD Q8 build, even with throughput cut to about 40 tok/s on a 24GB card. The model then stayed coherent through dozens of agent tool calls, including a self-written web-search extension.

// ANALYSIS

This is one user's report rather than a benchmark, but it is a reminder that agentic coding punishes lossy quantization. In long, tool-heavy sessions, quantization quality and memory placement (what sits in VRAM versus system RAM) can matter more than raw speed.

– Qwen’s own model card emphasizes agentic coding and “thinking preservation,” so the report fits the release’s intended use case
– The contrast between IQ4_XS and Q8 suggests ultra-low-bit quants may be fine for chat but still too brittle for sustained agent loops
– On 24GB VRAM, the real tradeoff is reliability versus latency: Q8 plus CPU MoE offload is slower, but apparently far steadier
– The llama.cpp serving flags matter here too; context handling, MTP, and MoE offload choices can change whether the model stays on track
– If this holds up across more users, Qwen3.6 looks more compelling as a local agent model than as a pure throughput play
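The 24GB tradeoff in the list above can be made concrete with rough arithmetic. The sketch below is a back-of-the-envelope estimate, not a measurement: it assumes 35B total parameters (taken from the model name), the nominal llama.cpp figures of about 4.25 bits per weight for IQ4_XS and 8.5 for Q8_0, and a hypothetical 3 GB reserve for KV cache and buffers. The Unsloth UD Q8 build mixes precisions, so its real file size will differ somewhat.

```python
# Rough weight-memory estimate. All figures are approximations, not measurements.
BITS_PER_WEIGHT = {
    "IQ4_XS": 4.25,  # nominal llama.cpp figure for this quant type
    "Q8_0": 8.5,     # 32 int8 weights plus one fp16 scale per block
}


def weight_gb(total_params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    return total_params_billion * bits_per_weight / 8


def fits_in_vram(size_gb: float, vram_gb: float, reserve_gb: float = 3.0) -> bool:
    """True if the weights fit while leaving reserve_gb (an assumed value)
    for KV cache and compute buffers."""
    return size_gb + reserve_gb <= vram_gb


if __name__ == "__main__":
    for name, bpw in BITS_PER_WEIGHT.items():
        size = weight_gb(35, bpw)
        print(f"{name}: ~{size:.1f} GB, fits in 24 GB: {fits_in_vram(size, 24)}")
```

Under these assumptions the IQ4_XS weights come to roughly 18.6 GB and fit on the card, while Q8 comes to roughly 37.2 GB and does not. That is consistent with the report's Q8 setup leaning on CPU MoE offload and running slower.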

// TAGS

qwen3.6-35b-a3b, llm, agent, ai-coding, inference, open-source

DISCOVERED

53d ago

2026-04-21

PUBLISHED

53d ago

2026-04-21

RELEVANCE

9/10

AUTHOR

s1mplyme
