OPEN_SOURCE ↗
REDDIT · REDDIT// 26d agoMODEL RELEASE
Qwen3.5-9B hits thinking loops in Ollama
Users are reporting that Alibaba’s newly released Qwen3.5-9B model enters infinite internal reasoning cycles when deployed via Ollama and OpenWebUI. This "thinking loop" behavior often manifests as repetitive plan-checking monologues, preventing the model from delivering a final answer despite its high reasoning benchmarks.
// ANALYSIS
Qwen 3.5 9B’s hyper-efficient reasoning architecture is a double-edged sword: it punches way above its weight class but oscillates wildly without strict constraint parameters.
- –The issue is often tied to high temperatures (1.0+) in small quantized versions; lowering temperature to 0.7-0.8 typically stabilizes the internal monologue.
- –Ollama's native `--think=false` flag or the `/set nothink` command can force-disable the reasoning path to bypass the loop entirely.
- –System prompts that explicitly limit reasoning steps to a fixed number (e.g., "Analyze in max 3 steps") have proven effective at forcing termination.
- –With a 256K native context and GPQA scores topping GPT-4o, the model is clearly optimized for "thinking" which UI wrappers aren't yet perfectly tuned to handle.
// TAGS
qwen3-5-9bllmollamareasoningopen-sourceself-hosted
DISCOVERED
26d ago
2026-03-16
PUBLISHED
26d ago
2026-03-16
RELEVANCE
9/ 10
AUTHOR
Xyhelia