OPEN_SOURCE
REDDIT · 2d ago · BENCHMARK RESULT
Gemma 4 26B posts strong numbers on the R9700
This Reddit benchmark rerun shows Gemma 4 26B quantized GGUF running well on an AMD Radeon AI Pro R9700, with Vulkan hitting about 2,949 tok/s on prompt processing and 92.9 tok/s on generation. The author corrected an earlier batch-size mistake, so these numbers are closer to a fair default-config comparison.
// ANALYSIS
Local Gemma 4 inference on AMD looks backend-sensitive: on this card, Vulkan materially outpaced ROCm in prefill, while decode stayed strong but closer together. That makes the result useful less as a universal Gemma score and more as a signal that the runtime stack can dominate real-world throughput.
- Vulkan beat ROCm on this setup by a wide margin in prompt processing: 2,949 vs 1,422 tok/s at `pp1000`, and 1,450 vs 681 tok/s at `pp1000 @ d50000`.
- Generation speed was also higher under Vulkan, but by a smaller gap: 92.9 vs 70.9 tok/s at `tg2500`, narrowing to 78.2 vs 61.5 tok/s at the longest context.
- The test was run with a 210 W power cap on ROCm 7.2, so the result reflects both software maturity and power-policy constraints, not just raw GPU capability.
- For people trying to run 26B-class open models locally, this is a reminder to benchmark the whole stack: driver, backend, quantization format, and batch settings all matter.
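The `pp1000`/`tg2500`/`d50000` labels above match llama-bench output, so a whole-stack comparison like this one can be reproduced with two builds of llama.cpp. A minimal sketch, assuming hypothetical build directories and model filename (the post does not give exact paths or the quantization used):

```shell
# Sketch: compare Vulkan vs ROCm backends of llama.cpp on the same GGUF model.
# Backends are chosen at build time (e.g. -DGGML_VULKAN=ON vs -DGGML_HIP=ON),
# so each backend gets its own binary. Paths and filename are assumptions.

MODEL=gemma-4-26b-Q4_K_M.gguf   # hypothetical quantized GGUF

# Vulkan build: 1000-token prompt processing (pp1000) and 2500-token
# generation (tg2500), at zero depth and at 50k-token depth (d50000).
./build-vulkan/bin/llama-bench -m "$MODEL" -p 1000 -n 2500 -d 0,50000 -ngl 99

# ROCm (HIP) build: identical settings, so only the backend differs.
./build-rocm/bin/llama-bench -m "$MODEL" -p 1000 -n 2500 -d 0,50000 -ngl 99
```

Keeping flags identical across the two runs (and leaving batch size at the default, or pinning it explicitly in both) is the point: the post's original numbers were skewed by exactly this kind of batch-size mismatch.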
// TAGS
gemma-4 · llm · benchmark · gpu · inference · open-weights
DISCOVERED
2d ago
2026-04-10
PUBLISHED
2d ago
2026-04-09
RELEVANCE
9/10
AUTHOR
ProfessionalSpend589