DeepSeek-V3.2 GGUFs "eat" think tags

// 45d ago NEWS

Users running Unsloth's DeepSeek-V3.2 GGUF models on llama-server report that the opening <think> tag is missing from responses, which breaks reasoning UI features in tools like Open WebUI. The cause is the chat template: it places the opening tag inside the prompt, at the start of the assistant's turn, so the model generates from after the tag and the output stream never contains it.
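
The mechanism can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the [USER] and [ASSISTANT] markers, render_prompt, and fake_generate are hypothetical stand-ins, not DeepSeek's actual template or llama-server's API. It only shows why text that sits at the end of the prompt cannot appear in the generated stream.

```python
# Toy illustration only: these markers are hypothetical, not DeepSeek's real template.
GENERATION_PREFIX = "<think>\n"


def render_prompt(user_message: str) -> str:
    """Render a prompt whose final characters already open the reasoning block."""
    return f"[USER]{user_message}[ASSISTANT]{GENERATION_PREFIX}"


def fake_generate(prompt: str) -> str:
    """Stand-in for the model: it continues after the prompt ends,
    so it has no reason to emit '<think>' a second time."""
    assert prompt.endswith(GENERATION_PREFIX)
    return "The user wants 2 + 2.\n</think>\nThe answer is 4."


prompt = render_prompt("What is 2 + 2?")
streamed = fake_generate(prompt)

# The opening tag lives in the prompt; only the closing tag is streamed.
print(repr(prompt[-len(GENERATION_PREFIX):]))
print("<think>" in streamed, "</think>" in streamed)
```

A client that only sees the streamed text receives a closing </think> with no opening tag before it.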

// ANALYSIS

The missing-tag bug is a classic chat-template mismatch, and it highlights the friction between raw GGUF quants and complex reasoning models.

  • The chat template appends the <think> tag to the end of the prompt, so the tag has already been "consumed" by the time the model starts generating and never appears in the output.
  • Frontends like Open WebUI rely on the literal presence of the <think> tag in the stream to trigger collapsed reasoning blocks; without it, the raw "thought" text leaks into the main UI.
  • The immediate fix is to launch llama-server with the --jinja flag, so that the server's template engine handles the reasoning field correctly.
  • This recurrence of a known R1-era bug suggests that quantization pipelines for newer DeepSeek versions are still struggling with template consistency across different inference engines.
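
The frontend failure described in the bullets above can be sketched as follows. This is a minimal sketch under stated assumptions: split_reasoning is a hypothetical stand-in for a tag-based parser and is not Open WebUI's actual code, and restore_opening_tag is a hypothetical client-side patch that is not mentioned in the report. The fix the report names is the --jinja flag.

```python
import re

# Hypothetical tag-based parser: it only recognises a complete <think>...</think> pair.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Return (reasoning, answer); reasoning is None when no complete block is found."""
    match = THINK_BLOCK.search(text)
    if match is None:
        return None, text
    answer = text[:match.start()] + text[match.end():]
    return match.group(1).strip(), answer.strip()


def restore_opening_tag(text):
    """Assumed workaround: if a closing tag has no opening tag before it, prepend one."""
    close_at = text.find("</think>")
    if close_at != -1 and "<think>" not in text[:close_at]:
        return "<think>" + text
    return text


raw = "Add the numbers.\n</think>\nThe answer is 4."

# Without the opening tag, the whole text, reasoning included, is treated as the answer.
print(split_reasoning(raw))

# With the tag restored, the reasoning separates cleanly from the answer.
print(split_reasoning(restore_opening_tag(raw)))
```

Patching the text on the client treats the symptom only; handling the template on the server keeps every frontend consistent.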
// TAGS
deepseek-v3-2, llama-cpp, gguf, unsloth, reasoning, llm

DISCOVERED: 45d ago (2026-04-20)
PUBLISHED: 45d ago (2026-04-20)
RELEVANCE: 8/10
AUTHOR: Winter_Engineer2163