YOU ARE VIEWING ONE ITEM FROM THE AICRIER FEED

Qwen3.5-35B-A3B MLX Ports Keep Crashing

AICrier tracks AI developer news across Product Hunt, GitHub, Hacker News, YouTube, X, arXiv, and more. This page keeps the article you opened front and center while giving you a path into the live feed.

// WHAT AICRIER DOES

7+

TRACKED FEEDS

24/7

SCRAPED FEED

Short summaries, external links, screenshots, relevance scoring, tags, and featured picks for AI builders.

Qwen3.5-35B-A3B MLX Ports Keep Crashing
OPEN LINK ↗
// 71d agoOPENSOURCE RELEASE

Qwen3.5-35B-A3B MLX Ports Keep Crashing

The thread is really about Qwen3.5-35B-A3B on MLX: the user says GGUF builds feel stable, while MLX builds in LM Studio crash and keep leaking `<think>` output. They’re asking whether a newer MLX conversion or a template tweak can make the Mac-native path behave as well as the GGUF ports.

// ANALYSIS

Inference: this looks more like an MLX conversion/runtime mismatch than a broken model family.

  • [Qwen3.5’s official model card](https://huggingface.co/Qwen/Qwen3.5-35B-A3B) says the model thinks by default and that direct responses require `chat_template_kwargs: {"enable_thinking": False}`.
  • Fresh MLX conversions do exist, including [NexVeridian/Qwen3.5-35B-A3B-4bit](https://huggingface.co/NexVeridian/Qwen3.5-35B-A3B-4bit) and [NexVeridian/Qwen3.5-35B-A3B-6bit](https://huggingface.co/NexVeridian/Qwen3.5-35B-A3B-6bit), both converted from the official model with newer `mlx-lm` releases.
  • The Reddit replies point toward newer quantizations like MXFP4-community, which suggests the practical fix is probably “fresh conversion + correct template” rather than a prompt-only hack: [discussion](https://www.reddit.com/r/LocalLLaMA/comments/1rwge3s/is_there_a_good_version_of_qwen3530ba3b_for_mlx/).
// TAGS
qwen3-5-35b-a3bllmopen-sourceopen-weightsself-hostedinferencereasoning

DISCOVERED

71d ago

2026-03-17

PUBLISHED

71d ago

2026-03-17

RELEVANCE

8/ 10

AUTHOR

Snorty-Pig