llama.cpp fixes XML tool-call ordering

// PRODUCT UPDATE

llama.cpp has merged an autoparser fix that accepts optional arguments in any order in XML-like tagged tool calls. It addresses a bug where models such as Qwen3.5 and Qwen3-Coder-Next could get stuck failing or looping on tools like `read_file`. For developers running local coding agents, it should make long-context tool use more reliable without requiring changes to tool definitions.

// ANALYSIS

This is exactly the kind of unglamorous parser fix that makes agent stacks feel less brittle in real use. Local inference frameworks now win or lose on tool-call reliability as much as raw model quality.

– The underlying issue was argument-order rigidity: if a model emitted `limit` before `offset`, the grammar could reject the call even when the intent was correct.
– The failure showed up most clearly on file-reading tools with multiple optional params, where the assistant would retry broken calls and fall into loops.
– Because the change lands in llama.cpp's autoparser layer, downstream apps using its server-side tool calling should benefit without redesigning schemas.
– It is a useful reminder that practical agent UX depends on protocol tolerance and parser ergonomics, not just better benchmarks.
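
To make the argument-order point concrete, here is a minimal Python sketch. It is not llama.cpp's autoparser code. The tag layout (`<function=...>`, `<parameter=...>`), the `read_file` signature with a required `path` and optional `offset` and `limit`, and both checker functions are assumptions made for illustration only.

```python
import re

# Hypothetical tool signature (an assumption, not taken from llama.cpp):
# one required parameter, then optional parameters in their declared order.
REQUIRED = ["path"]
OPTIONAL = ["offset", "limit"]

PARAM_RE = re.compile(r"<parameter=(\w+)>(.*?)</parameter>", re.DOTALL)


def param_names(call: str) -> list[str]:
    """Return parameter names in the order the model emitted them."""
    return [name for name, _value in PARAM_RE.findall(call)]


def accepts_strict(call: str) -> bool:
    """Order-rigid check: parameters must follow the declared order."""
    names = param_names(call)
    declared = iter(REQUIRED + OPTIONAL)
    # `n in declared` consumes the iterator, so this is a subsequence test.
    in_order = all(n in declared for n in names)
    return in_order and all(r in names for r in REQUIRED)


def accepts_tolerant(call: str) -> bool:
    """Order-tolerant check: required first, optional ones in any order."""
    names = param_names(call)
    head, tail = names[: len(REQUIRED)], names[len(REQUIRED):]
    no_duplicates = len(tail) == len(set(tail))
    return head == REQUIRED and no_duplicates and set(tail) <= set(OPTIONAL)


# The model emits `limit` before `offset`: the intent is clear, the order is not.
SHUFFLED = (
    "<function=read_file>"
    "<parameter=path>src/main.py</parameter>"
    "<parameter=limit>50</parameter>"
    "<parameter=offset>100</parameter>"
    "</function>"
)

# The same call with optional parameters in declared order.
ORDERED = (
    "<function=read_file>"
    "<parameter=path>src/main.py</parameter>"
    "<parameter=offset>100</parameter>"
    "<parameter=limit>50</parameter>"
    "</function>"
)

if __name__ == "__main__":
    print("strict accepts shuffled call:  ", accepts_strict(SHUFFLED))
    print("tolerant accepts shuffled call:", accepts_tolerant(SHUFFLED))
```

In this sketch the strict checker rejects the shuffled call while the tolerant one accepts it, and both accept the ordered call. An agent that keeps re-emitting the shuffled form against a strict checker would fail on every retry, which matches the looping behavior described above.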

// TAGS

llama-cpp, llm, inference, open-source, agent

DISCOVERED

2026-03-10

PUBLISHED

2026-03-06

RELEVANCE

8/10

AUTHOR

ilintar
