llama.cpp fixes XML tool-call ordering
OPEN_SOURCE
REDDIT · 32d ago · PRODUCT UPDATE


llama.cpp has merged an autoparser fix that reshuffles optional arguments for XML-like tagged tool calls, addressing a bug where models such as Qwen3.5 and Qwen3-Coder-Next could get stuck failing or looping on tools like `read_file`. For developers running local coding agents, it should make long-context tool use noticeably more reliable without changing tool definitions.

// ANALYSIS

This is exactly the kind of unglamorous parser fix that makes agent stacks feel less brittle in real use. Local inference frameworks now win or lose on tool-call reliability as much as raw model quality.

  • The underlying issue was argument-order rigidity: if a model emitted `limit` before `offset`, the grammar could reject the call even when the intent was correct
  • The failure showed up most clearly on file-reading tools with multiple optional params, where the assistant would retry broken calls and fall into loops
  • Because the change lands in llama.cpp's autoparser layer, downstream apps using its server-side tool calling should benefit without redesigning schemas
  • It is a useful reminder that practical agent UX depends on protocol tolerance and parser ergonomics, not just better benchmarks
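To make the failure mode concrete, here is a minimal sketch of order-tolerant argument parsing for XML-like tagged tool calls. The `<parameter=name>` tag convention and the `read_file` schema below are illustrative assumptions, not llama.cpp's actual parser: the idea is simply to collect arguments by name into a map and reorder them against the schema, rather than rejecting a call because `limit` arrived before `offset`.

```python
import re

# Hypothetical sketch (not llama.cpp's real implementation): match
# <parameter=NAME>VALUE</parameter> pairs anywhere in the call body.
TAG_RE = re.compile(r"<parameter=(\w+)>\s*(.*?)\s*</parameter>", re.S)

# Assumed schema: canonical argument order for the tool.
SCHEMA_ORDER = {"read_file": ["path", "offset", "limit"]}

def parse_call(text: str, tool: str) -> dict:
    """Collect parameters by name, then emit them in schema order.

    Because arguments are keyed by name first, the order the model
    emitted them in no longer matters; missing optional arguments
    are simply skipped instead of invalidating the whole call.
    """
    found = dict(TAG_RE.findall(text))
    return {name: found[name] for name in SCHEMA_ORDER[tool] if name in found}

# Model emitted `limit` before `path` and `offset` -- previously a
# rigid grammar could reject this even though the intent is clear.
call = """
<parameter=limit>40</parameter>
<parameter=path>src/main.rs</parameter>
<parameter=offset>100</parameter>
"""
args = parse_call(call, "read_file")
# args now holds path, offset, limit in canonical schema order.
```

The design choice mirrors the one described above: tolerance lives in the parser, so downstream tool definitions and schemas stay unchanged.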
// TAGS
llama-cpp · llm · inference · open-source · agent

DISCOVERED

2026-03-10 (32d ago)

PUBLISHED

2026-03-06 (36d ago)

RELEVANCE

8/10

AUTHOR

ilintar