Qwen 3.6 35B Genesis-V2 hits with APEX, MTP
Developer LuffyTheFox released a refined "Genesis-V2" build of Qwen 3.6 35B A3B that uses "numerical surgery" to fix architectural weight drift. The release introduces APEX quantization and native Multi-Token Prediction (MTP), both aimed at faster and more stable local inference.
This release suggests that open-weight models can need community "repair" to reach their full potential when their weights have drifted during training or conversion.
- APEX quantization uses Wasserstein distance metrics to restore weight symmetry and fix saturation issues in the base model.
- Native MTP support significantly increases throughput on compatible hardware by predicting multiple tokens in a single forward pass.
- The 35B MoE architecture, with 3B active parameters per token, remains the "goldilocks" size for high-performance local AI on consumer GPUs.
- Tester reports indicate high reliability across the 262K context window, making the build a viable candidate for complex long-context coding tasks.
- The hybrid of Gated DeltaNet and softmax attention offers a scalable alternative to the quadratic cost of standard Transformer attention.
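
The summary does not say how APEX applies the Wasserstein distance, so the following is only a sketch of the metric itself, not of the APEX procedure. It assumes NumPy, equal-size 1-D samples, and a hypothetical uniform symmetric quantizer (`quantize_symmetric`) that exists purely to produce something to compare against. For equal-size samples, the 1-D Wasserstein-1 distance reduces to the mean absolute difference between the sorted values.

```python
import numpy as np


def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-size 1-D samples.

    For equal-size empirical distributions, the optimal transport plan
    pairs the i-th smallest value of `a` with the i-th smallest of `b`.
    """
    a = np.sort(np.asarray(a, dtype=np.float64).ravel())
    b = np.sort(np.asarray(b, dtype=np.float64).ravel())
    if a.size != b.size:
        raise ValueError("this sketch assumes equal-size samples")
    return float(np.mean(np.abs(a - b)))


def quantize_symmetric(w, bits=4):
    """Hypothetical uniform symmetric quantizer (illustration only, not APEX)."""
    levels = 2 ** (bits - 1) - 1          # e.g. 7 positive levels at 4 bits
    scale = np.max(np.abs(w)) / levels    # one scale for the whole tensor
    return np.round(w / scale) * scale    # snap to the grid, then dequantize


# Synthetic stand-in for a weight tensor; real checkpoints are not loaded here.
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.02, size=4096)

d4 = wasserstein_1d(weights, quantize_symmetric(weights, bits=4))
d8 = wasserstein_1d(weights, quantize_symmetric(weights, bits=8))
print(f"W1 at 4 bits: {d4:.6f}, W1 at 8 bits: {d8:.6f}")
```

A smaller distance means the quantized weights stay closer in distribution to the originals, which is the kind of signal a quantization scheme could monitor. Whether APEX measures it per tensor, per block, or in some other way is not stated in the source.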
DISCOVERED: 2026-05-24
PUBLISHED: 2026-05-24
AUTHOR: EvilEnginer