llama.cpp vision needs mmproj, multimodal CLI
This Reddit thread is a practical guide to getting Qwen3.5-4B GGUF vision working in llama.cpp. The poster found that the separate mmproj projector and the multimodal server path work, while plain llama-cli did not.
The key correction is that llama.cpp multimodal support is not exposed through plain llama-cli; the working path uses the multimodal CLI or llama-server with a separate mmproj file. The post is useful as a real-world troubleshooting note, and the 20 tokens/sec question reads more like a configuration and benchmarking issue than a model limitation.
DISCOVERED
45d ago
2026-04-20
PUBLISHED
45d ago
2026-04-20
RELEVANCE
AUTHOR
Dabber43