Gemma-4-26B-A4B-it-NVFP4 runs on vLLM via community patch

// 51d ago · OPENSOURCE RELEASE

Developers have deployed Gemma-4-26B-A4B-it-NVFP4 on vLLM by applying a custom Python patch that fixes weight-scale mapping for Mixture-of-Experts (MoE) layers. The setup uses the Marlin backend on NVIDIA Blackwell (SM 12.1) hardware, with NVFP4-quantized weights and an FP8 KV cache for local serving. The community fix lets users run Google's latest open-weights model on vLLM before the main repository officially supports the architecture.
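
To put the memory side of this setup in rough numbers, the sketch below estimates weight storage at BF16 versus NVFP4 and the per-token KV cache at 16-bit versus FP8. It assumes the nominal 26B parameter count from the model name, that every weight is quantized (real checkpoints usually keep some tensors in higher precision), and NVFP4's layout of 4-bit values with one FP8 scale per 16-value block. The layer count, KV-head count, and head dimension are placeholders, not Gemma 4's actual configuration.

```python
# Back-of-the-envelope memory estimate; all model dimensions are assumptions.

GIB = 2**30

# NVFP4: 4-bit values plus one 8-bit (FP8) scale shared by each block of 16 values.
NVFP4_BITS_PER_WEIGHT = 4 + 8 / 16  # 4.5 bits per weight on average
BF16_BITS_PER_WEIGHT = 16


def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB for a given average bit width."""
    return n_params * bits_per_weight / 8 / GIB


def kv_bytes_per_token(num_layers: int, num_kv_heads: int,
                       head_dim: int, bytes_per_elem: int) -> int:
    """Bytes of KV cache per token: one K and one V vector per layer."""
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem


N_PARAMS = 26e9  # nominal total parameter count, taken from the model name

# Placeholder attention shape (hypothetical, NOT Gemma 4's real config).
LAYERS, KV_HEADS, HEAD_DIM = 32, 8, 128

if __name__ == "__main__":
    print(f"BF16 weights:  {weight_gib(N_PARAMS, BF16_BITS_PER_WEIGHT):.1f} GiB")
    print(f"NVFP4 weights: {weight_gib(N_PARAMS, NVFP4_BITS_PER_WEIGHT):.1f} GiB")
    print(f"KV/token, 16-bit: {kv_bytes_per_token(LAYERS, KV_HEADS, HEAD_DIM, 2)} bytes")
    print(f"KV/token, FP8:    {kv_bytes_per_token(LAYERS, KV_HEADS, HEAD_DIM, 1)} bytes")
```

Under these assumptions the weights come to roughly 48.4 GiB at BF16 and 13.6 GiB at NVFP4, and an FP8 KV cache uses half the bytes per token of a 16-bit cache.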

// ANALYSIS

The run shows how quickly the open-source community adapts existing inference frameworks to new architectures.

  • The patch fixes expert_params_mapping errors in which dot-separated scale keys failed to map to the fused MoE parameters in the vLLM executor.
  • The setup requires the Marlin backend, which reflects a shift toward specialized kernels for 4-bit (NVFP4) deployments on Blackwell's SM 12.1 architecture.
  • Combining NVFP4 weights with an FP8 KV cache cuts memory use for both the weights and the cached attention state, which makes 26B-class models more practical to serve on local hardware.
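
The summary does not include the patch itself. As a purely illustrative sketch of the kind of mapping the first bullet describes, the snippet below translates dot-separated per-expert checkpoint keys (weights and scales alike) into fused parameter names. The key layout, the `w13`/`w2` fused naming, and the `map_expert_key` helper are all assumptions made for this example; they are not taken from the patch and may not match vLLM's actual loader.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical scheme: gate/up projections share one fused tensor ("w13"),
# the down projection uses another ("w2"). The int is the shard index.
PROJ_TO_FUSED = {
    "gate_proj": ("w13", 0),
    "up_proj": ("w13", 1),
    "down_proj": ("w2", 0),
}

# Matches e.g. "<prefix>.experts.<id>.<proj>.<suffix>", where the suffix may be
# "weight", "weight_scale", or any other single dot-separated component.
_EXPERT_KEY = re.compile(
    r"^(?P<prefix>.*\.experts)\.(?P<expert>\d+)\."
    r"(?P<proj>gate_proj|up_proj|down_proj)\.(?P<suffix>[A-Za-z0-9_]+)$"
)


class FusedTarget(NamedTuple):
    param_name: str  # name of the fused parameter to load into
    expert_id: int   # which expert slice of the fused tensor
    shard: int       # which half of a fused gate/up tensor (0 otherwise)


def map_expert_key(key: str) -> Optional[FusedTarget]:
    """Map a per-expert checkpoint key to its fused target, or None."""
    m = _EXPERT_KEY.match(key)
    if m is None:
        return None  # not a per-expert key; leave it to the regular loader
    fused, shard = PROJ_TO_FUSED[m["proj"]]
    # Carry the suffix over so scale tensors land next to their weights.
    param_name = f"{m['prefix']}.{fused}_{m['suffix']}"
    return FusedTarget(param_name, int(m["expert"]), shard)


if __name__ == "__main__":
    for k in (
        "model.layers.0.moe.experts.3.down_proj.weight",
        "model.layers.0.moe.experts.3.down_proj.weight_scale",
        "model.layers.0.self_attn.q_proj.weight",
    ):
        print(k, "->", map_expert_key(k))
```

Parsing the key into components, rather than relying on substring replacement, makes it explicit that scale keys follow the same expert and shard routing as the weights they belong to.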
// TAGS
gemma-4-26b-a4b-it-nvfp4, gemma-4, vllm, nvfp4, blackwell, marlin, moe, quantization, open-source

DISCOVERED

51d ago

2026-04-06

PUBLISHED

51d ago

2026-04-06

RELEVANCE

8/10

AUTHOR

NovelAdorable7033