Qwen3.5-9B GGUF targets reasoning, tool-use
A community-tuned Qwen3.5-9B model optimized for reasoning and tool-use via Opus-4.6 and FunctionGemma datasets. Now available in GGUF format for efficient local inference in llama.cpp, Ollama, and LM Studio.
This release bridges the gap between small, fast local models and the complex reasoning capabilities typically reserved for larger frontier models.
- –Hybrid fine-tuning on Opus-4.6 Reasoning and Google's Mobile-Actions datasets significantly improves structured output reliability
- –9B parameter count provides a "sweet spot" for 8GB VRAM consumer GPUs while maintaining high instruction-following accuracy
- –Native GGUF support ensures immediate compatibility with the local LLM ecosystem without custom implementation
- –The focus on "action-oriented" prompting makes it an ideal candidate for local autonomous agents and home automation tasks
- –Quantized at Q4_K_M (5.6GB), it fits into almost any modern development environment with minimal overhead
DISCOVERED
72d ago
2026-03-18
PUBLISHED
72d ago
2026-03-17
RELEVANCE
AUTHOR
RiverRatt
