llama.cpp update breaks Open WebUI web search
OPEN_SOURCE ↗
REDDIT // 3h ago · INFRASTRUCTURE

This Reddit post reports a regression after a recent llama.cpp backend update: web-search tool calling no longer works for Qwen 3.6 27B in Open WebUI, even though the same GGUF quant and setup reportedly worked before. The failure reads like a backend/runtime compatibility break rather than a model change, so the likely suspects are tool-call formatting, sampling behavior, or a server-side change in how function-call outputs are emitted.

// ANALYSIS

Hot take: this looks more like a regression in the llama.cpp serving layer than a Qwen model failure.

  • The report ties the breakage to a backend update, with no other config changes mentioned.
  • The failure mode is specific to tool calling, which usually points to output formatting or protocol handling.
  • Open WebUI is the visible client, but the triggering change seems to be in llama.cpp.
  • Since the break reproduces with the same GGUF file that previously worked, the quantization matters less than the inference/runtime behavior.
  • Worth checking whether the regression appears only on web-search tools or affects all function/tool calls.
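One way to narrow this down is to bypass Open WebUI entirely and query llama-server's OpenAI-compatible `/v1/chat/completions` endpoint with a minimal `tools` array, then check whether the response carries structured `tool_calls` or dumps tool-call markup into plain text. A sketch, assuming a llama-server instance on localhost:8080; the `web_search` tool schema and model name are illustrative placeholders, not taken from the post:

```python
import json
import urllib.request

# Hypothetical web-search tool schema, mirroring what a client like
# Open WebUI would attach to the request.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web for a query",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

def build_request(prompt: str) -> dict:
    """Build an OpenAI-style chat completion request with tools attached."""
    return {
        "model": "qwen",  # placeholder; llama-server serves whatever model it loaded
        "messages": [{"role": "user", "content": prompt}],
        "tools": TOOLS,
        "tool_choice": "auto",
    }

def extract_tool_calls(response: dict) -> list:
    """Return structured tool calls, or [] if the model answered in plain
    text (the failure mode: tool-call JSON leaking into `content`)."""
    message = response["choices"][0]["message"]
    return message.get("tool_calls") or []

def probe(base_url: str = "http://localhost:8080") -> list:
    """POST the probe to a running llama-server and return its tool calls."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_request("Search for the latest llama.cpp release")).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_tool_calls(json.load(resp))
```

If `probe()` returns an empty list on the updated build but structured calls on the previous one, the regression is in llama.cpp's tool-call emission, not in Open WebUI; running the same probe with a non-search tool schema would also answer whether all function calls are affected or only web search.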
// TAGS
llama-cpp · qwen · open-webui · tool-calling · web-search · regression · gguf

DISCOVERED

3h ago

2026-04-28

PUBLISHED

7h ago

2026-04-28

RELEVANCE

7 / 10

AUTHOR

Big_Mix_4044