llama.cpp tool calls break with Qwen3.5
OPEN_SOURCE
REDDIT · 32d ago · INFRASTRUCTURE


A Reddit report says recent `llama-server` runs can still handle plain OpenAI-compatible chat with Qwen3.5, but fail on tool calls with an automatic parser-generation error tied to the Jinja chat template. Related llama.cpp GitHub issues suggest this is part of a broader pattern of Qwen tool-calling brittleness in the project’s parser and template stack rather than a simple user misconfiguration.
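To make the failure mode concrete, here is a minimal sketch of the kind of OpenAI-compatible request involved. The model name, endpoint, and tool definition are illustrative assumptions, not details from the report; the relevant point is that adding a `tools` array is what makes `llama-server` generate a tool-call parser from the model's Jinja chat template, which is where the reported error occurs, while the same payload without `tools` reportedly works.

```python
import json

# Hypothetical local llama-server address; adjust to your own setup.
BASE_URL = "http://localhost:8080"

def build_tool_call_request(user_message: str) -> dict:
    """Build an OpenAI-compatible chat payload that includes a tool definition.

    Without the "tools" key this is a plain chat request, which reportedly
    still works; with it, llama-server must build a tool-call parser from the
    model's Jinja chat template, the step that reportedly fails for Qwen3.5.
    """
    return {
        "model": "qwen3.5",  # placeholder; use whatever model your server loaded
        "messages": [{"role": "user", "content": user_message}],
        "tools": [
            {
                "type": "function",
                "function": {
                    # Illustrative tool, not taken from the report.
                    "name": "get_weather",
                    "description": "Get the current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }

payload = build_tool_call_request("What's the weather in Oslo?")
print(json.dumps(payload, indent=2))
```

Sending this payload to `POST {BASE_URL}/v1/chat/completions` (or exercising `/v1/responses`) on an affected server build is, per the report, enough to surface the parser-generation error.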

// ANALYSIS

This looks more like a llama.cpp tool-calling regression than a knock on Qwen itself, especially since the same setup reportedly works through Ollama.

  • The reported failure happens at the parser/template layer, with `Unexpected message role` thrown while generating the tool-call parser for `/v1/responses`
  • A separate March 2026 llama.cpp issue documents Qwen3.5 repeatedly failing tool calls under longer context, which points to a real ecosystem bug cluster rather than one broken local setup
  • An older 2025 llama.cpp issue shows similar Qwen tool-calling trouble around Jinja templates and missing `content` keys, so this is not a brand-new edge case
  • For local-agent developers this is the painful kind of bug: basic chat keeps working, which masks the fact that function calling has silently broken
  • The practical takeaway is that Qwen-on-llama.cpp remains powerful, but tool use is still sensitive to template, parser, and server-version drift
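Given the two failure shapes named above — roles the chat template does not expect, and messages missing a `content` key — one speculative client-side mitigation is to normalize messages before sending them. This is a sketch of a workaround under assumed template behavior, not a confirmed fix for the llama.cpp bug; the allowed-role set is an assumption about what typical Qwen Jinja templates handle.

```python
# Defensive normalization pass over chat messages before sending to llama-server.
# Speculative workaround: targets unexpected roles and missing "content" keys.

ALLOWED_ROLES = {"system", "user", "assistant", "tool"}  # assumed template roles

def normalize_messages(messages: list[dict]) -> list[dict]:
    normalized = []
    for msg in messages:
        m = dict(msg)  # copy so the caller's messages are untouched
        # Map roles the template may not recognize onto ones it likely does.
        if m.get("role") not in ALLOWED_ROLES:
            m["role"] = "user"
        # Some templates index msg["content"] unconditionally; guarantee the
        # key exists even for assistant messages that only carry tool_calls.
        if m.get("content") is None:
            m["content"] = ""
        normalized.append(m)
    return normalized

msgs = [
    {"role": "developer", "content": "Be terse."},            # non-standard role
    {"role": "assistant", "tool_calls": [{"id": "call_1"}]},  # no "content" key
]
print(normalize_messages(msgs))
```

This does nothing for server-side parser-generation bugs, but it removes two client-side triggers that the linked issues associate with template failures, and it is cheap to keep in an agent loop while the upstream issues are open.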
// TAGS
llama-cpp · qwen3-5 · llm · inference · api · agent · open-source

DISCOVERED

32d ago

2026-03-10

PUBLISHED

33d ago

2026-03-09

RELEVANCE

7/10

AUTHOR

chibop1