OPEN_SOURCE
REDDIT // 20d ago · MODEL RELEASE
Qwen3.5 user pushes back on overthinking
A Reddit user reports that Qwen3.5 35B-A3B and 27B produce concise, high-quality answers under stock llama.cpp settings, without the runaway reasoning others describe. They say both the Unsloth-recommended and Qwen-recommended sampler presets produced garbled output, while removing the custom parameters fixed the experience.
// ANALYSIS
My read: the "overthinking" reputation looks at least partly like a reproducibility problem, not a model flaw. The same Qwen3.5 can look crisp or chaotic depending on the sampler, prompt template, and tool stack around it.
- The poster keeps the setup intentionally small: defaults only, one local server, and four simple tools.
- That makes the anecdote useful because it strips out a lot of agentic complexity that can inflate reasoning traces.
- Qwen's own model card says Qwen3.5 thinks by default and recommends specific sampling settings, so "pure defaults" is a legitimate baseline.
- The post is a reminder that local-model anecdotes are hard to compare unless people share prompts, temperatures, top-p values, context size, and tool definitions.
- This does not disprove the overthinking complaints, but it does suggest they may be workflow-specific rather than universal.
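The reproducibility point above can be made concrete. A minimal sketch of the settings an anecdote would need to carry to be comparable across setups — field names and example values here are illustrative, not any standard schema or the poster's actual configuration:

```python
import json

def make_run_manifest(model, sampler, context_size, prompt_template, tools):
    """Bundle everything needed to reproduce a local-inference anecdote.

    An empty sampler dict means "server defaults", which is the baseline
    the Reddit poster used. All field names are hypothetical.
    """
    return {
        "model": model,                      # exact model file / quantization
        "sampler": sampler,                  # temperature, top_p, top_k, ...
        "context_size": context_size,        # tokens
        "prompt_template": prompt_template,  # chat template actually applied
        "tools": tools,                      # tool names exposed to the model
    }

# Two runs that differ only in the sampler block: any quality difference
# is then attributable to sampling, not to prompts, context, or tools.
stock = make_run_manifest(
    model="qwen3.5-35b-a3b-q4.gguf",   # placeholder filename
    sampler={},                        # defaults only
    context_size=8192,
    prompt_template="chatml",
    tools=["read_file", "write_file", "list_dir", "run_cmd"],
)
custom = dict(stock, sampler={"temperature": 0.7, "top_p": 0.95})

print(json.dumps(stock, indent=2))
```

Sharing a manifest like this alongside an anecdote is what would let readers distinguish a model flaw from a sampler or template mismatch.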
// TAGS
qwen-3.5 · llm · reasoning · inference · self-hosted · open-weights · prompt-engineering
DISCOVERED
2026-03-22
PUBLISHED
2026-03-22
RELEVANCE
9/10
AUTHOR
wadeAlexC