OPEN_SOURCE
REDDIT // 6h ago · MODEL RELEASE
Qwen3.6-35B-A3B loops, repeats in agents
The Reddit thread reports that Qwen3.6-27B Q8 behaves normally in Agent Zero, while Qwen3.6-35B-A3B Q8 falls into constant loops at 256k context and becomes nearly unusable. Replies point to sampling settings, especially temperature and presence penalty, as the likely lever.
// ANALYSIS
The pattern looks less like a universal model failure and more like a bad fit between this MoE model, long-context agent orchestration, and the default decoding preset. Qwen’s own docs recommend higher-temperature thinking configs and note that presence_penalty tuning can reduce endless repetition, which matches the community reports here.
- Qwen3.6-35B-A3B is a sparse MoE with 35B total and 3B active parameters, so it can be more sensitive to generation settings than the denser 27B variant.
- Official guidance for thinking mode is much hotter than many local defaults: `temperature=1.0`, `top_p=0.95`, `presence_penalty=1.5`, `repetition_penalty=1.0`.
- The thread’s replies suggest low-temperature presets can trigger loops, while moderate temperature increases help more than aggressive repetition penalties.
- If this shows up mainly in Agent Zero, the wrapper likely needs model-specific presets rather than a one-size-fits-all local config.
- For long-context agent runs, the practical fix is usually to adjust sampling first, then inspect the chat template, thinking-mode settings, and any speculative decoding or cache configuration.
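The sampling values above can be bundled into a small, model-aware preset helper. This is a minimal sketch, not Agent Zero's actual configuration API: the function name and the "conservative" contrast preset are hypothetical, while the thinking-mode numbers are the ones from Qwen's guidance quoted above. Note that `repetition_penalty` is a common local-inference extension (e.g. in vLLM or llama.cpp servers), not a standard OpenAI parameter.

```python
# Hypothetical helper for picking sampling presets per model/mode.
# Thinking-mode values mirror Qwen's published recommendation; the
# "conservative" preset illustrates the kind of low-temperature local
# default that the thread reports as loop-prone.

def qwen_sampling_preset(thinking: bool = True) -> dict:
    """Return a sampling-parameter dict for an OpenAI-compatible endpoint."""
    if thinking:
        # Hotter sampling plus a strong presence penalty to
        # discourage the endless repetition seen in long agent runs.
        return {
            "temperature": 1.0,
            "top_p": 0.95,
            "presence_penalty": 1.5,
            "repetition_penalty": 1.0,
        }
    # A typical conservative local default, shown only for contrast;
    # per the thread, presets like this can trigger loops on the MoE model.
    return {
        "temperature": 0.2,
        "top_p": 0.9,
        "presence_penalty": 0.0,
        "repetition_penalty": 1.1,
    }


if __name__ == "__main__":
    print(qwen_sampling_preset(thinking=True))
```

A wrapper like Agent Zero could key such presets on the model name, so the 35B-A3B MoE gets the hot thinking-mode settings while the dense 27B keeps its existing defaults.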
// TAGS
qwen3.6-35b-a3b · llm · reasoning · agent · self-hosted · open-weights
DISCOVERED
6h ago
2026-05-01
PUBLISHED
9h ago
2026-04-30
RELEVANCE
8/10
AUTHOR
Safe-Buffalo-4408