YOU ARE VIEWING ONE ITEM FROM THE AICRIER FEED

llama.cpp update breaks Open WebUI web search

AICrier tracks AI developer news across Product Hunt, GitHub, Hacker News, YouTube, X, arXiv, and more. This page keeps the article you opened front and center while giving you a path into the live feed.

// WHAT AICRIER DOES

7+

TRACKED FEEDS

24/7

SCRAPED FEED

Short summaries, external links, screenshots, relevance scoring, tags, and featured picks for AI builders.

llama.cpp update breaks Open WebUI web search
OPEN LINK ↗
// 45d agoINFRASTRUCTURE

llama.cpp update breaks Open WebUI web search

This Reddit post reports a regression after a recent llama.cpp backend update: web-search tool calling no longer works for Qwen 3.6 27B in Open WebUI, while the same GGUF quant and setup reportedly worked before. The issue reads like a backend/runtime compatibility break rather than a model change, so the likely suspects are tool-call formatting, sampling behavior, or a server-side change in how function-call outputs are emitted.

// ANALYSIS

Hot take: this looks more like a regression in the llama.cpp serving layer than a Qwen model failure.

  • The report ties the breakage to a backend update, with no other config changes mentioned.
  • The failure mode is specific to tool calling, which usually points to output formatting or protocol handling.
  • Open WebUI is the visible client, but the triggering change seems to be in llama.cpp.
  • The GGUF quantization noted here matters less than the inference/runtime behavior if the break is reproducible across the same model file.
  • Worth checking whether the regression appears only on web-search tools or affects all function/tool calls.
// TAGS
llama-cppqwenopen-webuitool-callingweb-searchregressiongguf

DISCOVERED

45d ago

2026-04-28

PUBLISHED

45d ago

2026-04-28

RELEVANCE

7/ 10

AUTHOR

Big_Mix_4044