YOU ARE VIEWING ONE ITEM FROM THE AICRIER FEED

Open WebUI Breaks Qwen 3.6 Thinking

AICrier tracks AI developer news across Product Hunt, GitHub, Hacker News, YouTube, X, arXiv, and more. This page keeps the article you opened front and center while giving you a path into the live feed.

// WHAT AICRIER DOES

7+ TRACKED FEEDS · 24/7 FEED SCRAPING

Short summaries, external links, screenshots, relevance scoring, tags, and featured picks for AI builders.

// NEWS · 1h ago

Open WebUI Breaks Qwen 3.6 Thinking

Open WebUI users report that Qwen 3.6 running via llama.cpp loses its `preserve_thinking` behavior, even though the same model works in llama.cpp's own web UI. Open WebUI's docs say it only preserves reasoning that the model actually returns, and a current GitHub discussion points to `reasoning_content` being mishandled when it is reinjected into the next turn.

// ANALYSIS

This looks more like a client-side compatibility bug than a model issue: the backend can emit reasoning, but Open WebUI may be serializing or replaying it in the wrong shape. Since Open WebUI can only preserve reasoning the model actually returns, the GitHub discussion's suggestion that `reasoning_content` is stripped or moved into the wrong field on the next turn would explain the symptom. That would break agentic workflows that depend on multi-turn reasoning, which makes the llama.cpp native UI the better reference implementation for now: it passes the chat-template kwargs through more directly. If you need the feature today, the likely fix is a pipe/filter or a targeted Open WebUI issue or PR rather than a hidden toggle.
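The failure mode described above can be sketched in a few lines. This is a hypothetical illustration of the reinjection behavior, not Open WebUI's actual code: the function names (`replay_correct`, `replay_broken`) and the message shapes are assumptions, modeled on the OpenAI-style chat format that llama.cpp's server speaks, where reasoning rides alongside `content` in a separate `reasoning_content` field.

```python
# Hypothetical sketch of the reinjection bug described above.
# Function names and message shapes are illustrative, not Open WebUI code.

def replay_correct(history):
    """Re-send assistant turns keeping reasoning_content as its own field,
    so the backend's chat template can re-inject the prior reasoning."""
    out = []
    for msg in history:
        m = {"role": msg["role"], "content": msg["content"]}
        if msg["role"] == "assistant" and "reasoning_content" in msg:
            m["reasoning_content"] = msg["reasoning_content"]
        out.append(m)
    return out

def replay_broken(history):
    """The failure mode the GitHub discussion suggests: reasoning is
    dropped on replay, so the template sees no prior reasoning at all."""
    return [{"role": m["role"], "content": m["content"]} for m in history]

history = [
    {"role": "user", "content": "What is 2+2?"},
    {"role": "assistant", "content": "4",
     "reasoning_content": "The user asks for 2+2; 2+2 = 4."},
]

# The correct replay keeps the field; the broken one silently loses it.
assert "reasoning_content" in replay_correct(history)[1]
assert "reasoning_content" not in replay_broken(history)[1]
```

Comparing the two payloads against the backend directly (as llama.cpp's native UI effectively does) is one way to confirm whether the bug is in the client's replay path rather than in the model.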

// TAGS
open-webui · llama-cpp · reasoning · llm · api · debugging · self-hosted

DISCOVERED

1h ago · 2026-05-11

PUBLISHED

2h ago · 2026-05-11

RELEVANCE

7 / 10

AUTHOR

sterby92