BACK_TO_FEEDAICRIER_2
Qwen3.5-9B hits thinking loops in Ollama
OPEN_SOURCE ↗
REDDIT · REDDIT// 26d agoMODEL RELEASE

Qwen3.5-9B hits thinking loops in Ollama

Users are reporting that Alibaba’s newly released Qwen3.5-9B model enters infinite internal reasoning cycles when deployed via Ollama and OpenWebUI. This "thinking loop" behavior often manifests as repetitive plan-checking monologues, preventing the model from delivering a final answer despite its high reasoning benchmarks.

// ANALYSIS

Qwen 3.5 9B’s hyper-efficient reasoning architecture is a double-edged sword: it punches way above its weight class but oscillates wildly without strict constraint parameters.

  • The issue is often tied to high temperatures (1.0+) in small quantized versions; lowering temperature to 0.7-0.8 typically stabilizes the internal monologue.
  • Ollama's native `--think=false` flag or the `/set nothink` command can force-disable the reasoning path to bypass the loop entirely.
  • System prompts that explicitly limit reasoning steps to a fixed number (e.g., "Analyze in max 3 steps") have proven effective at forcing termination.
  • With a 256K native context and GPQA scores topping GPT-4o, the model is clearly optimized for "thinking" which UI wrappers aren't yet perfectly tuned to handle.
// TAGS
qwen3-5-9bllmollamareasoningopen-sourceself-hosted

DISCOVERED

26d ago

2026-03-16

PUBLISHED

26d ago

2026-03-16

RELEVANCE

9/ 10

AUTHOR

Xyhelia