Qwen3.6-35B-A3B fails closing CoT token

// 90d agoMODEL RELEASE

Qwen3.6-35B-A3B fails closing CoT token

Alibaba's new sparse MoE model occasionally outputs the multi-token string </thinking> instead of the dedicated </think> closing token during reasoning. This minor regression breaks API adapters and coding harnesses that rely on precise token detection to separate internal reasoning from final output.

// ANALYSIS

This "infinite thinking" bug highlights the fragility of CoT-enabled models when paired with strict regex-based output parsers. The mismatch between the model's intended vocabulary and its generated output suggests training or quantization edge cases, with observed issues occurring across context lengths from 16k to 128k. While quantization like IQ4_NL may exacerbate the behavior, workarounds involve manual Jinja template adjustments or the use of specific reasoning parsers like the vLLM qwen3 implementation.

// TAGS

qwen3.6-35b-a3bqwenllmreasoningai-codingopen-weightsquantization

DISCOVERED

90d ago

2026-04-19

PUBLISHED

90d ago

2026-04-19

RELEVANCE

8/ 10

AUTHOR

Confident_Ideal_5385

// KEEP READING

More AI developer news from the feed

EXPLORE FULL FEED

NEWS59m ago

OpenCode founder replaces editor with voice

OpenCode co-founder Dax Raad (@thdxr) shared that fast local voice models have replaced traditional editors and coding agents in his workflow. He noted that the speed of voice dictation beats keyboard-driven navigation, especially as LLMs are now robust enough to infer intent from rambling speech.

UPDATE2h ago

open-slide ships Keynote-style Morph Transition

The open-slide presentation framework has launched Morph Transition, enabling Keynote-style Magic Move animation effects. Powered by a new MorphElement component, the framework automatically handles motion, resizing, and color transitions, allowing AI coding agents to build them from natural language prompts.

MODEL3h ago

OpenRouter adds nine new AI models

Unified API provider OpenRouter has added nine major new AI models to its platform, highlighted by Moonshot AI's Kimi K3, Meta AI's Muse Spark 1.1, and Thinking Machines Lab's Inkling. The additions provide developers with immediate API access to these frontier systems for tasks ranging from long-horizon coding and tool use to multimodal reasoning.