Qwen3.5 local runs hit llama.cpp gibberish bug
A LocalLLaMA user reports that Qwen3.5-9B and 27B GGUF quants produce gibberish from the very first prompt on Windows with llama.cpp b8204, while a smaller Linux CPU setup runs at least one 9B quant correctly. The thread fits a broader community pattern of unstable outputs in early Qwen3.5 local deployments, which suggests runtime or configuration compatibility issues rather than a simple prompt-quality problem.
This looks less like “bad model quality” and more like ecosystem friction right after a fast model rollout.
- The failure reproducing across multiple Qwen3.5 sizes on one machine, but not another, is a classic signal of backend/runtime mismatch.
- Similar same-week reports in LocalLLaMA suggest a cluster of inference-stack issues (context handling, templates, or quant/runtime interactions).
- Existing older models working on the same Windows box narrows suspicion to Qwen3.5-specific serving behavior rather than general hardware instability.
- For AI developers, the practical takeaway is to treat first-week local model releases as integration events, not just model swaps.
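One way to act on the last point is a small smoke test that runs before a new model replaces an old one. The sketch below is a crude heuristic, not anything taken from the thread or from llama.cpp: the function names, the thresholds, and the ASCII-only character check (which assumes English test prompts) are all illustrative assumptions. It flags output that is empty, dominated by one repeated token, or mostly made of unexpected characters.

```python
import string
from collections import Counter

# Characters treated as "normal" for an English test prompt (assumption).
_ALLOWED = set(string.ascii_letters + string.digits + string.punctuation + " \n\t")


def gibberish_signals(text: str, min_words: int = 8) -> dict:
    """Return crude signals that `text` may be degenerate model output."""
    words = text.split()
    if not words:
        return {"empty": True, "top_word_share": 0.0, "odd_char_share": 0.0}

    # Share of the output taken up by the single most frequent word.
    # Skipped for very short answers, where repetition is not meaningful.
    top_word_share = 0.0
    if len(words) >= min_words:
        top_count = Counter(words).most_common(1)[0][1]
        top_word_share = top_count / len(words)

    # Share of characters outside the allowed ASCII set.
    odd_chars = sum(1 for ch in text if ch not in _ALLOWED)
    return {
        "empty": False,
        "top_word_share": top_word_share,
        "odd_char_share": odd_chars / len(text),
    }


def looks_like_gibberish(
    text: str,
    max_top_word_share: float = 0.5,  # hypothetical threshold
    max_odd_char_share: float = 0.3,  # hypothetical threshold
) -> bool:
    """Flag output that is empty, highly repetitive, or full of odd characters."""
    s = gibberish_signals(text)
    return (
        s["empty"]
        or s["top_word_share"] > max_top_word_share
        or s["odd_char_share"] > max_odd_char_share
    )
```

In practice the input would be the response to a fixed, simple prompt from whichever runtime is under test; that call is left out here because it depends on the local setup. A check like this only catches gross failures of the kind described in the thread, and it will misfire on legitimately non-English or highly repetitive answers.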
DISCOVERED
96d ago
2026-03-05
PUBLISHED
96d ago
2026-03-05
AUTHOR
jpbras