BACK_TO_FEEDAICRIER_2
Gemma 4 fails multi-turn tool calling
OPEN_SOURCE ↗
REDDIT · REDDIT// 3h agoNEWS

Gemma 4 fails multi-turn tool calling

Local LLM users report that Gemma 4's native function calling abruptly terminates generation during complex, multi-turn tool sequences. While initial tool calls succeed, the model fails to continue after responding to the user when subsequent tools are required.

// ANALYSIS

Gemma 4's native function calling is a welcome addition, but its brittle execution in local environments highlights the gap between hosted APIs and open-weight models.

  • The bug triggers specifically during extended agentic loops (think -> tool -> respond -> think -> tool), causing an immediate generation halt.
  • This severely limits the model's viability for autonomous workflows that require continuous, multi-step reasoning and interaction.
  • The failure likely stems from how local inference engines or quantization formats parse the model's new thinking and tool-use tokens.
  • Fixes will likely require client-side updates in inference frameworks like llama.cpp or Ollama to properly handle the token streams.
// TAGS
gemmagemma-4llmagentopen-weights

DISCOVERED

3h ago

2026-04-15

PUBLISHED

4h ago

2026-04-15

RELEVANCE

8/ 10

AUTHOR

dampflokfreund