OPEN_SOURCE ↗
REDDIT // 17d ago // INFRASTRUCTURE
OpenClaw stumbles on local Qwen3.5
A Reddit user is trying to run Jackrong’s Qwen3.5-9B Claude-4.6 Opus reasoning distill through Ollama and OpenClaw for coding agents, but the local setup keeps collapsing mid-response. The trouble seems to stem from the Modelfile/GGUF import path plus agent-side streaming and tool-call instability, rather than raw inference speed.
// ANALYSIS
This looks less like a bad model and more like a brittle agent integration stack. Reasoning distills, Ollama imports, and OpenClaw’s tool-calling path all have to line up, and a tiny mismatch can make a fast GPU feel broken.
- Ollama can import GGUFs directly with `FROM ./model.gguf` and `ollama create`; the Modelfile is the blueprint, not a separate artifact you need to hunt down online.
- Jackrong’s model card is built around structured `<think>...</think>` reasoning, so template or stop-token mismatches can derail output quickly.
- OpenClaw only surfaces models that report tool support during auto-discovery, so a custom Qwen distill can look "missing" or brittle even when Ollama itself is fine.
- OpenClaw’s guide also recommends tool-capable models like Qwen 2.5 or Llama 3.3, which suggests a reasoning-distilled Qwen3.5 variant may not be the most forgiving agent backend.
- The "compute spikes instead of staying solid" symptom reads like bursty retries or context thrash, not steady token generation.
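For the import step, a minimal Modelfile sketch illustrates the idea. The GGUF filename is a placeholder, and the ChatML-style template and stop token are an assumption about the distill's training format, not taken from the model card; a mismatch here is exactly the kind of thing that derails `<think>` output:

```
# Modelfile — hypothetical sketch for importing a local Qwen reasoning distill
FROM ./qwen3.5-9b-distill.Q4_K_M.gguf

# Template and stop token must match what the distill was trained on;
# these ChatML-style values are assumptions, not confirmed from the model card.
TEMPLATE """<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
PARAMETER stop "<|im_end|>"
```

Registering it with `ollama create qwen-distill -f Modelfile` and sanity-checking output via `ollama run qwen-distill` before pointing OpenClaw at it isolates import problems from agent-side ones.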
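On the agent side, tool-call support can be probed against Ollama's REST API before involving OpenClaw at all. The `/api/chat` endpoint and its `tools` field are standard Ollama; the model name and the dummy `get_time` tool below are hypothetical, for illustration only:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint


def build_tool_probe(model: str) -> dict:
    """Build a minimal non-streaming chat request offering one dummy tool.

    If the model's template supports tool calling, the reply's "message"
    should contain a "tool_calls" entry; otherwise the tool is ignored.
    """
    return {
        "model": model,
        "stream": False,  # one JSON reply is easier to inspect than a stream
        "messages": [
            {"role": "user", "content": "What time is it? Use the tool."}
        ],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_time",  # hypothetical tool, just for the probe
                "description": "Return the current time",
                "parameters": {"type": "object", "properties": {}},
            },
        }],
    }


def model_emits_tool_calls(model: str) -> bool:
    """Send the probe to a running Ollama server and check for tool_calls."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_tool_probe(model)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        message = json.load(resp).get("message", {})
    return bool(message.get("tool_calls"))
```

If the probe never returns structured `tool_calls`, OpenClaw's auto-discovery hiding the model is expected behavior, not a bug in the agent.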
// TAGS
openclaw · qwen · ollama · llm · agent · ai-coding · inference · self-hosted
DISCOVERED
2026-03-26
PUBLISHED
2026-03-25
RELEVANCE
8/10
AUTHOR
AngstyGlitter2