OPEN_SOURCE ↗
REDDIT // 2h ago · TUTORIAL
llama.cpp vision needs mmproj, multimodal CLI
This Reddit thread is a practical guide to getting Qwen3.5-4B GGUF vision working in llama.cpp. The poster found that loading the separate mmproj projector through the multimodal CLI or server path works, while plain llama-cli does not.
// ANALYSIS
The key correction is that llama.cpp's multimodal support is not exposed through plain llama-cli; the working path is the multimodal CLI or llama-server with a separate mmproj file. The post is useful as a real-world troubleshooting note, and the 20 tokens/sec question reads more like a configuration and benchmarking issue than a model limitation.
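A minimal sketch of the working path described above. The file names (`Qwen3.5-4B-Q4_K_M.gguf`, `mmproj-Qwen3.5-4B.gguf`, `photo.jpg`) are placeholders for whatever the poster downloaded; the binaries and flags (`llama-mtmd-cli`, `llama-server`, `--mmproj`, `--image`) are llama.cpp's multimodal tooling, but exact names can shift between builds, so check `--help` on your version.

```shell
# One-shot vision inference via the multimodal CLI:
# the text model and the mmproj projector are loaded as two separate GGUF files.
llama-mtmd-cli \
  -m Qwen3.5-4B-Q4_K_M.gguf \
  --mmproj mmproj-Qwen3.5-4B.gguf \
  --image photo.jpg \
  -p "Describe this image."

# Same pairing served over the OpenAI-compatible HTTP API;
# plain llama-cli has no equivalent --mmproj path.
llama-server \
  -m Qwen3.5-4B-Q4_K_M.gguf \
  --mmproj mmproj-Qwen3.5-4B.gguf \
  --port 8080
```

If the projector is omitted or passed to the wrong binary, the model loads as text-only, which matches the failure mode the poster hit with plain llama-cli.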
// TAGS
llama-cpp · qwen3.5 · multimodal · vision · gguf · mmproj · local-llm · inference · performance
DISCOVERED
2h ago
2026-04-20
PUBLISHED
4h ago
2026-04-20
RELEVANCE
7/10
AUTHOR
Dabber43