OPEN_SOURCE ↗
REDDIT // 8d ago // BENCHMARK RESULT
Gemma 4 26B-A4B runs well on M5 MacBook Air
A Reddit benchmark reports that Gemma 4 26B-A4B, running through opencode on a 32GB M5 MacBook Air, is fast enough for real local agentic coding: roughly 300 tokens/sec prompt processing, 12 tokens/sec generation, and about 8W power draw. The poster says it is the first laptop setup they have used that stays cool, quiet, and usable away from a wall plug.
// ANALYSIS
Local LLM coding is crossing the line from novelty to practical tool here, but the model still sounds like a strong assistant rather than a drop-in replacement for frontier cloud agents.
- The standout metric is prompt throughput: fast prefill matters more than raw generation speed when you're doing repo-scale agent workflows.
- Low-power, low-heat operation changes the actual UX; a laptop that can run an agent without spinning fans is a different class of machine for mobile dev work.
- The report is still a reminder that open models lag on autonomy and instruction quality, even when the hardware is finally good enough to support them.
- For developers, the interesting shift is not "can local models code?" but "can they stay useful long enough, cheaply enough, and quietly enough to fit real work sessions?"
- Opencode looks like a decent stress test for this trend because it exposes whether local inference can handle multi-step, tool-using behavior instead of just chat.
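Why prefill throughput dominates can be seen with back-of-envelope arithmetic using the reported figures (300 tok/s prefill, 12 tok/s generation). The context and output sizes below are illustrative assumptions, not from the post:

```python
# Compare wall-clock time for an agentic turn (big prompt, short output)
# vs a chat turn (small prompt, long output) at the reported throughputs.

PREFILL_TPS = 300   # prompt-processing tokens/sec (reported)
GEN_TPS = 12        # generation tokens/sec (reported)

def turn_seconds(prompt_tokens: int, output_tokens: int) -> float:
    """Wall-clock seconds for one turn: prefill time + generation time."""
    return prompt_tokens / PREFILL_TPS + output_tokens / GEN_TPS

# Repo-scale agent turn: large context, short tool-call response.
agentic = turn_seconds(prompt_tokens=12_000, output_tokens=200)  # 40s + ~17s
# Chat-style turn: small context, long answer.
chat = turn_seconds(prompt_tokens=500, output_tokens=600)        # ~1.7s + 50s

print(f"agentic turn: {agentic:.0f}s (prefill-dominated)")
print(f"chat turn:    {chat:.0f}s (generation-dominated)")
```

At these speeds, a 12k-token repo context alone costs 40 seconds of prefill per turn, which is why prompt throughput, not generation speed, sets the pace of multi-step agent loops.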
// TAGS
gemma-4-26b-a4b · opencode · llm · ai-coding · agent · cli · benchmark
DISCOVERED
8d ago
2026-04-03
PUBLISHED
9d ago
2026-04-03
RELEVANCE
9/10
AUTHOR
maddie-lovelace