OPEN_SOURCE
REDDIT · 3h ago · NEWS
Gemma 4 fails multi-turn tool calling
Local LLM users report that Gemma 4's native function calling abruptly terminates generation during complex, multi-turn tool sequences. While initial tool calls succeed, the model fails to continue after responding to the user when subsequent tools are required.
// ANALYSIS
Gemma 4's native function calling is a welcome addition, but its brittle execution in local environments highlights the gap between hosted APIs and open-weight models.
- The bug triggers specifically during extended agentic loops (think -> tool -> respond -> think -> tool), causing an immediate generation halt.
- This severely limits the model's viability for autonomous workflows that require continuous, multi-step reasoning and interaction.
- The failure likely stems from how local inference engines or quantization formats parse the model's new thinking and tool-use tokens.
- Fixes will likely require client-side updates in inference frameworks like llama.cpp or Ollama to properly handle the token streams.
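The loop shape in question can be sketched in a few lines. This is a minimal, self-contained illustration of the multi-turn agentic pattern (think -> tool -> respond -> think -> tool), not Gemma 4's or any framework's actual API: the model call is a stub, and names like `call_model`, `run_agent`, and the `get_weather` tool are hypothetical. In the reported bug, the second round of tool calls is where the model halts instead of continuing.

```python
def get_weather(city: str) -> str:
    """Hypothetical tool the model can request."""
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def call_model(messages):
    """Stub standing in for a local inference endpoint (e.g. served by
    llama.cpp or Ollama). A healthy model keeps emitting tool calls until
    the task is done; the reported failure is that after the first
    respond step it halts instead of issuing the next tool call."""
    tool_turns = sum(1 for m in messages if m["role"] == "tool")
    if tool_turns == 0:
        return {"role": "assistant", "tool_calls": [("get_weather", "Paris")]}
    if tool_turns == 1:
        # Second step of a multi-step task: another tool call is required.
        # This is the point where affected Gemma 4 setups reportedly stop.
        return {"role": "assistant", "tool_calls": [("get_weather", "Berlin")]}
    return {"role": "assistant", "content": "Paris and Berlin are both sunny."}

def run_agent(user_prompt, max_turns=5):
    """Drive the think -> tool -> respond loop until a final answer."""
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(max_turns):
        reply = call_model(messages)
        messages.append(reply)
        calls = reply.get("tool_calls")
        if not calls:
            # No tool call: either a final answer or, in the bug, a halt.
            return reply.get("content"), messages
        for name, arg in calls:
            # Execute each requested tool and feed the result back.
            messages.append({"role": "tool", "content": TOOLS[name](arg)})
    return None, messages

answer, transcript = run_agent("Compare the weather in Paris and Berlin")
```

A working setup completes both tool rounds and returns a final answer; in the failure mode users describe, `run_agent` would exit after the first tool result with no second call and no answer.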
// TAGS
gemma · gemma-4 · llm · agent · open-weights
DISCOVERED
3h ago
2026-04-15
PUBLISHED
4h ago
2026-04-15
RELEVANCE
8/10
AUTHOR
dampflokfreund