Qwen3.6-35B-A3B RFT goes open source
This project fine-tunes Qwen3.6-35B-A3B on its own generated coding outputs; the main technical challenge was targeting LoRA at the model's DeltaNet-specific parameter paths rather than the standard q/k/v attention projections. The author released the bf16 model, a 6-bit MLX build, the LoRA adapter, and the pipeline code, though the reported coding gains look marginal.
The release is more compelling as a reproducible training pipeline than as a clear model improvement story. The real value is the DeltaNet adapter work, which is the part most LoRA recipes will get wrong on this architecture.
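To make the targeting issue concrete, here is a minimal sketch of the difference using Hugging Face PEFT. The `linear_attn.*` module names are hypothetical placeholders, not the paths the post identifies; inspect the model's `named_modules()` output to find the real ones.

```python
from peft import LoraConfig

# Standard recipe: classic attention projections only. On a mostly-DeltaNet
# architecture these names match few layers, so the adapter is nearly a no-op.
standard = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# DeltaNet-aware recipe. The linear_attn.* names below are hypothetical;
# list the model's Linear layers to find the actual parameter paths:
#   for name, mod in model.named_modules():
#       if isinstance(mod, torch.nn.Linear):
#           print(name)
deltanet_aware = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # the few full-attention layers
        "linear_attn.in_proj",                   # hypothetical DeltaNet input projection
        "linear_attn.out_proj",                  # hypothetical DeltaNet output projection
    ],
)
```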
- Qwen3.6-35B-A3B uses gated DeltaNet in most layers, so standard attention-targeted LoRA recipes barely touch the model
- The post is useful because it names the actual parameter paths that matter, which is the sort of detail people only learn by breaking a run
- The data pipeline is closer to rejection fine-tuning than pure SSD-style self-distillation, since only compiled, test-passing samples were kept (sketched after this list)
- The benchmark result is weak evidence at best: 128/130 vs 126/130 on a tiny eval set is noise, not a win (see the quick significance check below)
- Releasing the adapter plus MLX quantization makes this more actionable for Apple Silicon users than the raw benchmark does (usage sketch below)
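The filtering step described in the third bullet amounts to rejection sampling with an execution check. A minimal sketch of that loop, assuming a `generate_fn` that wraps the model and a per-prompt test command (both hypothetical stand-ins for the released pipeline code):

```python
import subprocess
import tempfile
from pathlib import Path

def passes_checks(code: str, test_cmd: list[str]) -> bool:
    """Gate a sample on compiling and passing its tests."""
    try:
        compile(code, "<sample>", "exec")  # cheap syntax/compile gate first
    except SyntaxError:
        return False
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "solution.py"
        path.write_text(code)
        try:
            result = subprocess.run(
                test_cmd + [str(path)], capture_output=True, timeout=60
            )
        except subprocess.TimeoutExpired:
            return False
    return result.returncode == 0

def build_rft_dataset(prompts, test_cmds, generate_fn, k=8):
    """Keep only verified completions: rejection fine-tuning, not pure distillation."""
    kept = []
    for prompt, test_cmd in zip(prompts, test_cmds):
        for _ in range(k):
            sample = generate_fn(prompt)
            if passes_checks(sample, test_cmd):
                kept.append({"prompt": prompt, "completion": sample})
    return kept
```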
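On the benchmark point, a quick two-proportion z-test shows why a two-answer gap on 130 items carries no signal:

```python
from math import sqrt

# Reported scores: 128/130 (fine-tuned) vs 126/130 (base).
n = 130
p1, p2 = 128 / n, 126 / n
pooled = (128 + 126) / (2 * n)
se = sqrt(pooled * (1 - pooled) * (2 / n))
z = (p1 - p2) / se
print(f"z = {z:.2f}")  # ~0.83 — well under the ~1.96 needed for p < 0.05
```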
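For Apple Silicon users, loading the 6-bit build with the released adapter via mlx-lm looks roughly like the sketch below; the model repo and adapter paths are placeholders, not the author's actual upload IDs.

```python
from mlx_lm import load, generate

# Hypothetical paths — substitute the author's actual HF repo / local adapter dir.
model, tokenizer = load(
    "someuser/Qwen3.6-35B-A3B-RFT-6bit-mlx",
    adapter_path="path/to/released-lora-adapter",
)

print(generate(model, tokenizer,
               prompt="Write a binary search over a sorted list in Python.",
               max_tokens=256))
```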
DISCOVERED: 2026-05-07 (2h ago)
PUBLISHED: 2026-05-07 (6h ago)
AUTHOR: Snoo_27681