Qwen3.5-35B-A3B MLX Ports Keep Crashing
OPEN_SOURCE
REDDIT · 25d ago · OPEN-SOURCE RELEASE


The thread centers on running Qwen3.5-35B-A3B on MLX: the poster reports that GGUF builds are stable, while MLX builds in LM Studio crash and keep leaking raw `<think>` output into replies. They ask whether a newer MLX conversion or a chat-template tweak can make the Mac-native path behave as well as the GGUF ports.

// ANALYSIS

Inference: this looks more like an MLX conversion/runtime mismatch than a broken model family.

  • [Qwen3.5’s official model card](https://huggingface.co/Qwen/Qwen3.5-35B-A3B) says the model thinks by default and that direct responses require `chat_template_kwargs: {"enable_thinking": False}`.
  • Fresh MLX conversions do exist, including [NexVeridian/Qwen3.5-35B-A3B-4bit](https://huggingface.co/NexVeridian/Qwen3.5-35B-A3B-4bit) and [NexVeridian/Qwen3.5-35B-A3B-6bit](https://huggingface.co/NexVeridian/Qwen3.5-35B-A3B-6bit), both converted from the official model with newer `mlx-lm` releases.
  • The Reddit replies point toward newer quantizations like MXFP4-community, which suggests the practical fix is probably “fresh conversion + correct template” rather than a prompt-only hack: [discussion](https://www.reddit.com/r/LocalLLaMA/comments/1rwge3s/is_there_a_good_version_of_qwen3530ba3b_for_mlx/).
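The model card's route is to disable thinking via the chat template, but until a fixed conversion lands, the leaked reasoning can also be filtered client-side. A minimal sketch, assuming the leaked text is wrapped in Qwen's `<think>…</think>` tags (the `strip_think` helper below is hypothetical, not part of any library, and a mid-generation crash can leave an unterminated block):

```python
import re

def strip_think(text: str) -> str:
    """Remove <think>...</think> blocks that leak into the visible reply."""
    # Drop properly closed thinking blocks first.
    text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    # Then drop an unterminated trailing block (e.g. the runtime crashed
    # mid-thought and never emitted the closing tag).
    text = re.sub(r"<think>.*\Z", "", text, flags=re.DOTALL)
    return text.strip()

# Example: a reply with a closed thinking block.
print(strip_think("<think>plan the answer</think>The fix is a fresh conversion."))
```

This is a workaround, not a fix: if the crashes come from a stale `mlx-lm` conversion, re-quantizing from the official weights with a current `mlx-lm` release, plus the correct template flag, is the cleaner path.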
// TAGS
qwen3-5-35b-a3b · llm · open-source · open-weights · self-hosted · inference · reasoning

DISCOVERED

25d ago

2026-03-17

PUBLISHED

25d ago

2026-03-17

RELEVANCE

8/10

AUTHOR

Snorty-Pig