Open WebUI Breaks Qwen 3.6 Thinking
Open WebUI users report that Qwen 3.6 served via llama.cpp loses its `preserve_thinking` behavior, even though the same model works in llama.cpp's own web UI. Open WebUI's docs say it only preserves reasoning that the model actually returns, and a current GitHub discussion points to `reasoning_content` being mishandled on reinjection.
This looks more like a client-side compatibility bug than a model issue: the backend can emit reasoning, but Open WebUI may be serializing or replaying it in the wrong shape, with `reasoning_content` stripped or moved into the wrong field on the next turn. That would break agentic workflows, which makes the llama.cpp native UI a better reference implementation for now because it passes the chat-template kwargs through more directly. If you need the feature today, the likely fix is a pipe/filter or a targeted Open WebUI issue or PR rather than a hidden toggle.
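To make the suspected failure mode concrete, here is a minimal sketch of the two replay behaviors. It assumes an OpenAI-compatible chat payload where assistant messages can carry a `reasoning_content` field (the field named in the GitHub discussion); the helper functions are illustrative, not actual Open WebUI code.

```python
def replay_preserving_reasoning(history):
    """Build next-turn messages, passing reasoning_content through intact.

    This is what the backend's preserve_thinking path needs to see:
    prior assistant turns with their reasoning field still attached.
    """
    return [dict(m) for m in history]  # copy every field unchanged


def replay_stripping_reasoning(history):
    """The suspected bug: reasoning_content dropped before reinjection."""
    out = []
    for m in history:
        m = dict(m)
        m.pop("reasoning_content", None)  # reasoning lost on the next turn
        out.append(m)
    return out


history = [
    {"role": "user", "content": "Plan the next tool call."},
    {
        "role": "assistant",
        "content": "Calling the search tool.",
        "reasoning_content": "I should search before answering.",
    },
]

kept = replay_preserving_reasoning(history)
stripped = replay_stripping_reasoning(history)
```

A pipe/filter workaround would essentially do what `replay_preserving_reasoning` does: intercept the outgoing request and re-attach the reasoning field before the history reaches the backend.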
DISCOVERED: 2026-05-11
PUBLISHED: 2026-05-11
AUTHOR: sterby92