Gemma 4 31B Strains Apple Silicon
OPEN_SOURCE
REDDIT // 4d ago · MODEL RELEASE

A Reddit user says Gemma 4 31B in Ollama idles around 53GB on a 64GB M1 Ultra Mac Studio and crashes on interaction, despite Ollama’s library showing about 20GB and Google’s model card listing 58.3GB for BF16. The post highlights the gap between quantized package size, runtime overhead, and unified-memory behavior on Apple Silicon.

// ANALYSIS

The memory figures differ by precision and runtime path: Ollama’s library entry reflects the quantized package on disk, Google’s model card lists BF16 and Q4_0 sizes, and the reported idle footprint likely adds KV cache and runtime buffers on top of the loaded weights. The practical takeaway for Apple Silicon users is that unified memory is shared with the OS and GPU and can be exhausted quickly, so smaller quantizations, shorter contexts, or smaller Gemma 4 variants are safer choices.
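The gap between the three numbers is mostly arithmetic: weight storage scales with bits per parameter, and KV cache plus runtime buffers sit on top. A minimal sketch, assuming a 31B parameter count and typical GGUF-style effective bit widths (the exact parameter count and per-weight overheads are assumptions for illustration, not figures from the post):

```python
# Rough weight-storage estimate for a ~31B-parameter model at several
# precisions. Excludes KV cache and runtime buffers, which is why a
# loaded model uses more memory than its package size suggests.

GB = 1024**3

def model_bytes(params: float, bits_per_param: float) -> float:
    """Bytes needed to store the weights alone."""
    return params * bits_per_param / 8

params = 31e9  # assumed parameter count
for name, bits in [("BF16", 16), ("Q8_0", 8.5), ("Q4_0", 4.5)]:
    print(f"{name}: {model_bytes(params, bits) / GB:.1f} GB")
```

At 16 bits per weight this lands near the 58.3GB BF16 figure on the model card, while ~4.5 bits per weight lands near the ~20GB Ollama package; the reported 53GB idle footprint would then imply a larger quantization or substantial runtime overhead on top.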

// TAGS
gemma-4 · ollama · apple-silicon · macos · local-llm · quantization · unified-memory · llm

DISCOVERED

4d ago

2026-04-08

PUBLISHED

4d ago

2026-04-08

RELEVANCE

8 / 10

AUTHOR

TaylorHu