OPEN_SOURCE
REDDIT // 7d ago // NEWS
Gemma 4 E2B-it GGUFs hit shape mismatch
Users trying to load Gemma-4-E2B-IT GGUFs in llama.cpp are hitting tensor-shape errors on startup, even after redownloading and using recent builds. The failure points to a bad or mismatched conversion rather than a simple VRAM or `-ngl` problem.
// ANALYSIS
This looks like a launch-week compatibility bug in the GGUF ecosystem, not a hardware limitation. If the tensor layout in the file does not match what llama.cpp expects, no amount of extra GPU memory will make it load.
- The error shows `blk.2.attn_q.weight` with the wrong dimensions, which is classic evidence of a model-format mismatch or stale conversion
- Community chatter suggests some Gemma-4-E2B-IT GGUFs were broken or outdated, while reworked quants from other sources load successfully
- Gemma-4-E2B-IT is being pushed hard for local multimodal use, so loader correctness matters as much as benchmark quality
- For developers, the practical fix is to swap to a freshly rebuilt Gemma-4-E2B-IT GGUF and verify the exact quant/source, not just the llama.cpp version
- This is a reminder that open-weight model releases can still be fragile at the packaging layer even when the model itself is fine
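The verification step in the bullets above can be sketched as a quick shape diff: compare the tensor shapes actually stored in a GGUF file against the shapes the loader expects. This is a minimal sketch, not llama.cpp's own check; the `GGUFReader` usage in the comment assumes the `gguf` Python package from the llama.cpp repo, and every tensor dimension shown is a hypothetical placeholder, not taken from any real Gemma conversion.

```python
# Sketch: report every tensor whose on-disk shape differs from what
# the loader expects. Shapes are plain tuples of ints.

def find_shape_mismatches(expected: dict[str, tuple[int, ...]],
                          actual: dict[str, tuple[int, ...]]) -> list[str]:
    """Return one human-readable line per missing or misshapen tensor."""
    problems = []
    for name, want in expected.items():
        got = actual.get(name)
        if got is None:
            problems.append(f"{name}: missing from file")
        elif tuple(got) != tuple(want):
            problems.append(f"{name}: file has {tuple(got)}, expected {tuple(want)}")
    return problems

# In practice, `actual` would be read from the GGUF itself, e.g. with the
# `gguf` package (assumption -- check the package's current API):
#   from gguf import GGUFReader
#   actual = {t.name: tuple(int(d) for d in t.shape)
#             for t in GGUFReader("model.gguf").tensors}
if __name__ == "__main__":
    expected = {"blk.2.attn_q.weight": (2048, 2048)}  # hypothetical dims
    actual = {"blk.2.attn_q.weight": (2048, 256)}     # hypothetical dims
    for line in find_shape_mismatches(expected, actual):
        print(line)
```

A diff like this narrows the problem to the packaging layer before anyone reaches for `-ngl` flags or a bigger GPU, which matches the failure mode described above.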
// TAGS
gemma-4-e2b-it · llama-cpp · llm · open-source · inference · gpu
DISCOVERED
7d ago
2026-04-04
PUBLISHED
8d ago
2026-04-04
RELEVANCE
8/10
AUTHOR
Ready-Ad4340