BACK_TO_FEEDAICRIER_2
Qwen3.6-35B-A3B fails closing CoT token
OPEN_SOURCE ↗
REDDIT · REDDIT// 7h agoMODEL RELEASE

Qwen3.6-35B-A3B fails closing CoT token

Alibaba's new sparse MoE model occasionally outputs the multi-token string </thinking> instead of the dedicated </think> closing token during reasoning. This minor regression breaks API adapters and coding harnesses that rely on precise token detection to separate internal reasoning from final output.

// ANALYSIS

This "infinite thinking" bug highlights the fragility of CoT-enabled models when paired with strict regex-based output parsers. The mismatch between the model's intended vocabulary and its generated output suggests training or quantization edge cases, with observed issues occurring across context lengths from 16k to 128k. While quantization like IQ4_NL may exacerbate the behavior, workarounds involve manual Jinja template adjustments or the use of specific reasoning parsers like the vLLM qwen3 implementation.

// TAGS
qwen3.6-35b-a3bqwenllmreasoningai-codingopen-weightsquantization

DISCOVERED

7h ago

2026-04-19

PUBLISHED

9h ago

2026-04-19

RELEVANCE

8/ 10

AUTHOR

Confident_Ideal_5385