Gemma 4 fixes hit llama.cpp, Google updates templates

// 47d ago · PRODUCT UPDATE

Google has released updated Jinja chat templates for the Gemma 4 model family to address critical tool-calling failures. Simultaneously, llama.cpp has merged a fix for the reasoning budget sampler, enabling proper local support for the model's native "thinking" capabilities.

// ANALYSIS

Gemma 4's reasoning capabilities are finally becoming usable in local environments, but the initial GGUFs predate these fixes, so most users still need to intervene manually. The updated chat templates are required for the 31B, 26B, and "E" variants to fix tool-calling transitions, while llama.cpp PR #21697 fixes reasoning budget support by populating the missing thinking tags. Vision performance can be improved by manually tuning token limits, and temperatures as high as 1.5 are reportedly improving one-shot coding performance. Until models are re-quantized with the April 9th metadata updates, a manual template override via --chat-template-file remains necessary.
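
As a rough illustration of the manual override described above, the sketch below assembles a `llama-server` command line that loads a replacement Jinja template and sets the sampling temperature. The model and template file names are hypothetical placeholders, and the helper `build_llama_server_cmd` is introduced here only for illustration. It assumes a llama.cpp build that accepts the `--jinja`, `--chat-template-file`, and `--temp` options.

```python
import shlex


def build_llama_server_cmd(model_path, template_path, temperature=1.0):
    """Return an argv list for llama-server with a chat template override.

    The paths are supplied by the caller; nothing here checks that they exist.
    """
    return [
        "llama-server",
        "-m", model_path,                        # GGUF model file
        "--jinja",                               # use Jinja chat template handling
        "--chat-template-file", template_path,   # override the embedded template
        "--temp", str(temperature),              # sampling temperature
    ]


# Hypothetical file names, chosen only for this example.
cmd = build_llama_server_cmd(
    "gemma-4-31b.gguf",
    "gemma-4-template.jinja",
    temperature=1.5,
)

# Print a shell-safe version of the command for copy and paste.
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps each argument separate, so it can be passed directly to `subprocess.run` without shell quoting concerns.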

// TAGS
gemma-4, llama-cpp, llm, reasoning, open-weights, google, ai-coding

DISCOVERED: 2026-04-10 (47d ago)
PUBLISHED: 2026-04-10 (47d ago)
RELEVANCE: 9/10
AUTHOR: andy2na