llama.cpp tool calls break with Qwen3.5
OPEN_SOURCE
REDDIT · 32d ago · INFRASTRUCTURE


A Reddit report says recent `llama-server` runs can still handle plain OpenAI-compatible chat with Qwen3.5, but fail on tool calls with an automatic parser-generation error tied to the Jinja chat template. Related llama.cpp GitHub issues suggest this is part of a broader pattern of Qwen tool-calling brittleness in the project’s parser and template stack rather than a simple user misconfiguration.
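To make the failure mode concrete, here is a minimal sketch of the kind of OpenAI-compatible request involved. The model name, endpoint, and tool definition are illustrative assumptions, not details from the report; the relevant point is that adding a `tools` array is what makes `llama-server` generate a tool-call parser from the model's Jinja chat template, which is where the reported error occurs, while the same payload without `tools` reportedly works.

```python
import json

# Hypothetical local llama-server address; adjust to your own setup.
BASE_URL = "http://localhost:8080"

def build_tool_call_request(user_message: str) -> dict:
    """Build an OpenAI-compatible chat payload that includes a tool definition.

    Without the "tools" key this is a plain chat request, which reportedly
    still works; with it, llama-server must build a tool-call parser from the
    model's Jinja chat template, the step that reportedly fails for Qwen3.5.
    """
    return {
        "model": "qwen3.5",  # placeholder; use whatever model your server loaded
        "messages": [{"role": "user", "content": user_message}],
        "tools": [
            {
                "type": "function",
                "function": {
                    # Illustrative tool, not taken from the report.
                    "name": "get_weather",
                    "description": "Get the current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }

payload = build_tool_call_request("What's the weather in Oslo?")
print(json.dumps(payload, indent=2))
```

Sending this payload to `POST {BASE_URL}/v1/chat/completions` (or exercising `/v1/responses`) on an affected server build is, per the report, enough to surface the parser-generation error.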

// ANALYSIS

This looks more like a llama.cpp tool-calling regression than a knock on Qwen itself, especially since the same setup reportedly works through Ollama.

  • The reported failure happens at the parser/template layer, with `Unexpected message role` thrown while generating the tool-call parser for `/v1/responses`
  • A separate March 2026 llama.cpp issue documents Qwen3.5 repeatedly failing tool calls under longer context, which points to a real ecosystem bug cluster rather than one broken local setup
  • An older 2025 llama.cpp issue shows similar Qwen tool-calling trouble around Jinja templates and missing `content` keys, so this is not a brand-new edge case
  • For local-agent developers this is the painful kind of bug: basic chat keeps working, which masks the fact that function calling has silently broken
  • The practical takeaway is that Qwen-on-llama.cpp remains powerful, but tool use is still sensitive to template, parser, and server-version drift
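Given the two failure shapes named above — roles the chat template does not expect, and messages missing a `content` key — one speculative client-side mitigation is to normalize messages before sending them. This is a sketch of a workaround under assumed template behavior, not a confirmed fix for the llama.cpp bug; the allowed-role set is an assumption about what typical Qwen Jinja templates handle.

```python
# Defensive normalization pass over chat messages before sending to llama-server.
# Speculative workaround: targets unexpected roles and missing "content" keys.

ALLOWED_ROLES = {"system", "user", "assistant", "tool"}  # assumed template roles

def normalize_messages(messages: list[dict]) -> list[dict]:
    normalized = []
    for msg in messages:
        m = dict(msg)  # copy so the caller's messages are untouched
        # Map roles the template may not recognize onto ones it likely does.
        if m.get("role") not in ALLOWED_ROLES:
            m["role"] = "user"
        # Some templates index msg["content"] unconditionally; guarantee the
        # key exists even for assistant messages that only carry tool_calls.
        if m.get("content") is None:
            m["content"] = ""
        normalized.append(m)
    return normalized

msgs = [
    {"role": "developer", "content": "Be terse."},            # non-standard role
    {"role": "assistant", "tool_calls": [{"id": "call_1"}]},  # no "content" key
]
print(normalize_messages(msgs))
```

This does nothing for server-side parser-generation bugs, but it removes two client-side triggers that the linked issues associate with template failures, and it is cheap to keep in an agent loop while the upstream issues are open.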
// TAGS
llama-cpp · qwen3-5 · llm · inference · api · agent · open-source

DISCOVERED

32d ago

2026-03-10

PUBLISHED

33d ago

2026-03-09

RELEVANCE

7/10

AUTHOR

chibop1