Qwen3.6 MLX port trims refusals on Macs
This is an MLX release of an abliterated Qwen3.6-35B-A3B variant, built from a Heretic source checkpoint and quantized with a layer-aware 4/6-bit scheme for local Apple Silicon deployment. The model card says it keeps the base model's reasoning and instruction-following profile while removing refusal behavior at the weight level, and that it was validated with short chat, reasoning, and code smoke tests.
This is less about benchmark theatrics and more about a practical local-chat stack for Apple Silicon users who want a large MoE model that runs fast on Macs. The MLX packaging is the main value, and the abliterated positioning matters if you want fewer refusals. The layer-aware 4/6-bit quantization suggests more care than a flat 4-bit pass, but the validation is light enough that the "best chatbot" claim should still be treated as anecdotal.
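For readers who want to try the local-deployment claim, a minimal sketch of loading an MLX checkpoint with the `mlx-lm` CLI. The repo path and prompt below are placeholders, not the release's actual checkpoint name, which the post does not give here:

```shell
# Sketch only: the model path is a placeholder for the actual
# Hugging Face repo or local directory holding the MLX checkpoint.
pip install mlx-lm

mlx_lm.generate \
  --model <hf-repo-or-local-path> \
  --prompt "Summarize the tradeoffs of 4-bit quantization." \
  --max-tokens 256
```

On Apple Silicon, `mlx-lm` runs the quantized weights on the GPU via Metal; no separate runtime configuration is required beyond installing the package.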
DISCOVERED: 2026-05-08
PUBLISHED: 2026-05-08
AUTHOR: eclipsegum