Qwen3.6-35B-A3B loops, repeats in agents
This Reddit thread says Qwen3.6-27B Q8 behaves normally in Agent Zero, while Qwen3.6-35B-A3B Q8 falls into constant loops at 256k context and becomes hard to use. Replies point to sampling settings, especially temperature and presence penalty, as the likely lever.
The pattern looks less like a universal model failure and more like a bad fit between this MoE model, long-context agent orchestration, and the default decoding preset. Qwen’s own docs recommend higher-temperature thinking configs and note that presence_penalty tuning can reduce endless repetition, which matches the community reports here.
- –Qwen3.6-35B-A3B is a sparse MoE with 35B total and 3B active parameters, so it can be more sensitive to generation settings than the denser 27B variant.
- –Official guidance for thinking mode is much hotter than many local defaults: `temperature=1.0`, `top_p=0.95`, `presence_penalty=1.5`, `repetition_penalty=1.0`.
- –The thread’s replies suggest low-temperature presets can trigger loops, while moderate temperature increases help more than aggressive repetition penalties.
- –If this shows up mainly in Agent Zero, the wrapper likely needs model-specific presets rather than a one-size-fits-all local config.
- –For long-context agent runs, the practical fix is usually to adjust sampling first, then inspect the chat template, thinking-mode settings, and any speculative decoding or cache configuration.
DISCOVERED
51d ago
2026-05-01
PUBLISHED
51d ago
2026-04-30
RELEVANCE
AUTHOR
Safe-Buffalo-4408