Gemma 4 26B posts strong R9700 results

// BENCHMARK RESULT (48d ago)

This Reddit benchmark rerun shows a quantized GGUF build of Gemma 4 26B running well on an AMD Radeon AI Pro R9700, with the Vulkan backend reaching about 2,949 tok/s on prompt processing and 92.9 tok/s on generation. The author corrected an earlier batch-size mistake, so these numbers are closer to a fair default-config comparison.

// ANALYSIS

Local Gemma 4 inference on AMD looks backend-sensitive: on this card, Vulkan roughly doubled ROCm's prefill throughput, while decode speeds were strong on both backends and much closer together. That makes the result less useful as a universal Gemma score and more useful as a signal that the runtime stack can dominate real-world throughput.

  • Vulkan beat ROCm on this setup by a wide margin in prompt processing: 2,949 vs 1,422 tok/s at `pp1000`, and 1,450 vs 681 tok/s at `pp1000 @ d50000`.
  • Generation speed was also higher under Vulkan, but by a smaller gap: 92.9 vs 70.9 tok/s at `tg2500`, narrowing to 78.2 vs 61.5 tok/s at the longest context.
  • The test was run under a 210 W power cap with ROCm 7.2, so the result reflects both software maturity and power-policy constraints, not just raw GPU capability.
  • For people trying to run 26B-class open models locally, this is a reminder to benchmark the whole stack: driver, backend, quantization format, and batch settings all matter.
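
To make the backend gap concrete, the figures quoted in the bullets above can be turned into Vulkan/ROCm ratios. This is a minimal sketch, not the author's tooling: the `RESULTS` table, the `speedup` and `summarize` helpers, and the label `tg2500 @ longest context` are illustrative names introduced here (the summary does not state the depth used for the long-context generation run).

```python
# Throughput figures (tok/s) copied from the bullets above.
# The dictionary layout and test labels are assumptions made for illustration.
RESULTS = {
    "pp1000": {"vulkan": 2949.0, "rocm": 1422.0},
    "pp1000 @ d50000": {"vulkan": 1450.0, "rocm": 681.0},
    "tg2500": {"vulkan": 92.9, "rocm": 70.9},
    "tg2500 @ longest context": {"vulkan": 78.2, "rocm": 61.5},
}


def speedup(vulkan: float, rocm: float) -> float:
    """Return how many times faster Vulkan is than ROCm for one test."""
    return vulkan / rocm


def summarize(results: dict) -> dict:
    """Map each test name to its Vulkan/ROCm ratio, rounded to 2 decimals."""
    return {
        name: round(speedup(r["vulkan"], r["rocm"]), 2)
        for name, r in results.items()
    }


if __name__ == "__main__":
    for name, ratio in summarize(RESULTS).items():
        print(f"{name:<26} Vulkan/ROCm = {ratio:.2f}x")
```

On these numbers the prefill ratios come out a little above 2x, while the generation ratios sit near 1.3x, which matches the "wide margin" versus "smaller gap" wording in the bullets.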
// TAGS
gemma-4 · llm · benchmark · gpu · inference · open-weights

DISCOVERED

48d ago

2026-04-10

PUBLISHED

48d ago

2026-04-09

RELEVANCE

9/10

AUTHOR

ProfessionalSpend589