Gemma 4 fails multi-turn tool calling

// 90d agoNEWS

Gemma 4 fails multi-turn tool calling

Local LLM users report that Gemma 4's native function calling abruptly terminates generation during complex, multi-turn tool sequences. While initial tool calls succeed, the model fails to continue after responding to the user when subsequent tools are required.

// ANALYSIS

Gemma 4's native function calling is a welcome addition, but its brittle execution in local environments highlights the gap between hosted APIs and open-weight models.

–The bug triggers specifically during extended agentic loops (think -> tool -> respond -> think -> tool), causing an immediate generation halt.
–This severely limits the model's viability for autonomous workflows that require continuous, multi-step reasoning and interaction.
–The failure likely stems from how local inference engines or quantization formats parse the model's new thinking and tool-use tokens.
–Fixes will likely require client-side updates in inference frameworks like llama.cpp or Ollama to properly handle the token streams.

// TAGS

gemmagemma-4llmagentopen-weights

DISCOVERED

90d ago

2026-04-15

PUBLISHED

90d ago

2026-04-15

RELEVANCE

8/ 10

AUTHOR

dampflokfreund

// KEEP READING

More AI developer news from the feed

EXPLORE FULL FEED

LAUNCH10m ago

World of AI Bench launches evaluation platform

World of AI Bench is an independent model evaluation and ranking platform designed to benchmark AI models on real-world developer tasks like WebGL rendering, frontend design, and agentic workflows. By shifting focus from vendor-curated academic datasets to custom workspaces and AI-judged evaluations, it helps teams measure model capabilities in practical scenarios.

NEWS10m ago

Gemini 3.5 Pro delayed, new models leak

Google's frontier model Gemini 3.5 Pro has reportedly been delayed for a third time, stalling the release of the company's next-generation reasoning model. In parallel, details from internal registrations have surfaced online, revealing that Google is preparing launches for Gemini 3.6 Flash and Gemini 3.5 Flash Light, suggesting an aggressive push to optimize and expand their faster, cost-efficient model lineup.

OPEN SOURCE44m ago

Pluto-Genesis tops 2,600 downloads

Developer Siddharth N.R. announced that their open-source language model project, Pluto-Genesis, has crossed 2,600 downloads. The developer credited OpenAI's GPT-5.6 as an essential AI engineering partner that assisted with debugging, fine-tuning, benchmarking, and accelerating the release process from the first training run to the final release.