Google releases Gemma 4 under Apache 2.0
OPEN_SOURCE ↗
REDDIT // 8d ago · MODEL RELEASE

Google DeepMind's Gemma 4 family lands with a move to Apache 2.0 and a 31B dense model that pushes reasoning benchmarks. While the architecture enables a 256K context window, early users are reporting massive KV cache memory overhead due to non-standard head dimensions.

// ANALYSIS

Gemma 4 is a strategic pivot for Google, trading memory efficiency for pure reasoning power and an open-source friendly license.

  • The 31B model's "heterogeneous" architecture uses head dimensions of 256 and 512, significantly larger than the 128-dimension standard in models like Qwen or Llama.
  • KV cache quantization (4-bit/8-bit) is no longer optional for consumer hardware; it's a hard requirement to hit the advertised 256K context window.
  • The shift to Apache 2.0 is the real story, likely a move to reclaim market share from Meta's Llama in the developer ecosystem.
  • Early benchmarks put the 31B variant at #3 on the LMSYS Arena, effectively matching GPT-4o performance in a local-first package.
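The head-dimension and quantization points above can be sketched with back-of-envelope arithmetic. A minimal sketch follows; the layer count and KV-head count are illustrative assumptions for a ~31B dense model, not confirmed Gemma 4 specs:

```python
# Back-of-envelope KV cache sizing. Layer and KV-head counts below are
# illustrative assumptions, not confirmed Gemma 4 architecture details.

def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_elem):
    """Total KV cache size: 2 tensors (K and V) per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

GIB = 1024 ** 3
seq_len = 256_000          # advertised 256K context window

# Conventional 128-dim heads (Llama/Qwen-style), fp16 cache (2 bytes/elem)
baseline = kv_cache_bytes(seq_len, n_layers=48, n_kv_heads=8,
                          head_dim=128, bytes_per_elem=2)

# Same layer/head budget with 256-dim heads: cache doubles
wide = kv_cache_bytes(seq_len, n_layers=48, n_kv_heads=8,
                      head_dim=256, bytes_per_elem=2)

# Same wide heads with a 4-bit quantized cache (0.5 bytes/elem)
wide_q4 = kv_cache_bytes(seq_len, n_layers=48, n_kv_heads=8,
                         head_dim=256, bytes_per_elem=0.5)

print(f"128-dim fp16:  {baseline / GIB:.1f} GiB")   # → 46.9 GiB
print(f"256-dim fp16:  {wide / GIB:.1f} GiB")       # → 93.8 GiB
print(f"256-dim 4-bit: {wide_q4 / GIB:.1f} GiB")    # → 23.4 GiB
```

The pattern is clear regardless of the exact layer count: cache size scales linearly with head dimension, so doubling it doubles the fp16 cache, and only a 4-bit quantized cache brings the wide-head configuration back under the fp16 baseline.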
// TAGS
gemma-4 · google · llm · open-weights · open-source · kv-cache · reasoning

DISCOVERED

2026-04-03

PUBLISHED

2026-04-03

RELEVANCE

10 / 10

AUTHOR

IngeniousIdiocy