OPEN_SOURCE ↗
REDDIT // 2h ago · TUTORIAL
llama.cpp vision needs mmproj, multimodal CLI
This Reddit thread is a practical guide to getting Qwen3.5-4B GGUF vision working in llama.cpp. The poster found that loading the separate mmproj projector through the multimodal CLI or server path works, while plain llama-cli does not.
// ANALYSIS
The key correction is that llama.cpp's multimodal support is not exposed through plain llama-cli; the working path is the multimodal CLI or llama-server with a separate mmproj file. The post is useful as a real-world troubleshooting note, and the 20 tokens/sec question reads more like a configuration and benchmarking issue than a model limitation.
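A minimal sketch of the working path described above. The file names (`Qwen3.5-4B-Q4_K_M.gguf`, `mmproj-Qwen3.5-4B.gguf`, `photo.jpg`) are placeholders for whatever the poster downloaded; the binaries and flags (`llama-mtmd-cli`, `llama-server`, `--mmproj`, `--image`) are llama.cpp's multimodal tooling, but exact names can shift between builds, so check `--help` on your version.

```shell
# One-shot vision inference via the multimodal CLI:
# the text model and the mmproj projector are loaded as two separate GGUF files.
llama-mtmd-cli \
  -m Qwen3.5-4B-Q4_K_M.gguf \
  --mmproj mmproj-Qwen3.5-4B.gguf \
  --image photo.jpg \
  -p "Describe this image."

# Same pairing served over the OpenAI-compatible HTTP API;
# plain llama-cli has no equivalent --mmproj path.
llama-server \
  -m Qwen3.5-4B-Q4_K_M.gguf \
  --mmproj mmproj-Qwen3.5-4B.gguf \
  --port 8080
```

If the projector is omitted or passed to the wrong binary, the model loads as text-only, which matches the failure mode the poster hit with plain llama-cli.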
// TAGS
llama-cpp · qwen3.5 · multimodal · vision · gguf · mmproj · local-llm · inference · performance
DISCOVERED
2h ago
2026-04-20
PUBLISHED
4h ago
2026-04-20
RELEVANCE
7/10
AUTHOR
Dabber43