Qwen3.5-35B-A3B MLX Ports Keep Crashing
The thread is really about Qwen3.5-35B-A3B on MLX: the user says GGUF builds feel stable, while MLX builds in LM Studio crash and keep leaking `<think>` output. They’re asking whether a newer MLX conversion or a template tweak can make the Mac-native path behave as well as the GGUF ports.
Inference: this looks more like an MLX conversion/runtime mismatch than a broken model family.
- –[Qwen3.5’s official model card](https://huggingface.co/Qwen/Qwen3.5-35B-A3B) says the model thinks by default and that direct responses require `chat_template_kwargs: {"enable_thinking": False}`.
- –Fresh MLX conversions do exist, including [NexVeridian/Qwen3.5-35B-A3B-4bit](https://huggingface.co/NexVeridian/Qwen3.5-35B-A3B-4bit) and [NexVeridian/Qwen3.5-35B-A3B-6bit](https://huggingface.co/NexVeridian/Qwen3.5-35B-A3B-6bit), both converted from the official model with newer `mlx-lm` releases.
- –The Reddit replies point toward newer quantizations like MXFP4-community, which suggests the practical fix is probably “fresh conversion + correct template” rather than a prompt-only hack: [discussion](https://www.reddit.com/r/LocalLLaMA/comments/1rwge3s/is_there_a_good_version_of_qwen3530ba3b_for_mlx/).
DISCOVERED
71d ago
2026-03-17
PUBLISHED
71d ago
2026-03-17
RELEVANCE
AUTHOR
Snorty-Pig