OPEN_SOURCE
REDDIT // PRODUCT UPDATE

Gemma 4 fixes hit llama.cpp, Google updates templates

Google has released updated Jinja chat templates for the Gemma 4 model family to address critical tool-calling failures. Simultaneously, llama.cpp has merged a fix for the reasoning budget sampler, enabling proper local support for the model's native "thinking" capabilities.

// ANALYSIS

Gemma 4's reasoning capabilities are finally becoming usable in local environments, but the "broken" state of initial GGUFs means manual intervention is still required for most users. New chat templates are mandatory for 31B, 26B, and "E" variants to fix tool-calling transitions, while llama.cpp PR #21697 correctly implements reasoning budget support by populating missing thinking tags. Vision performance can be optimized by manually tuning token limits, and higher temperatures up to 1.5 are reportedly improving one-shot coding performance. Manual template overrides via --chat-template-file remain necessary unless models are re-quantized with the April 9th metadata updates.
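The template override described above can be applied at server launch time. A minimal sketch, assuming a recent llama.cpp build of `llama-server` and that the updated Jinja template has been downloaded locally; the model filename and template path are placeholders:

```shell
# Override the chat template baked into the GGUF with Google's updated
# Jinja file, enable Jinja template processing, and raise the sampling
# temperature as suggested for one-shot coding tasks.
llama-server \
  -m ./gemma-4-26b-it-Q4_K_M.gguf \
  --chat-template-file ./gemma4_updated_template.jinja \
  --jinja \
  --temp 1.5
```

Once GGUFs re-quantized with the updated metadata are available, the `--chat-template-file` override should no longer be necessary.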

// TAGS
gemma-4 · llama-cpp · llm · reasoning · open-weights · google · ai-coding

DISCOVERED

2026-04-10

PUBLISHED

2026-04-10

RELEVANCE

9/10

AUTHOR

andy2na