OPEN_SOURCE ↗
REDDIT // 3d ago // MODEL RELEASE
Dev fixes silent weight drift in Qwen3.5 35B
A developer discovered and fixed a silent tensor scaling bug in the uncensored Qwen3.5 35B A3B model that caused it to lose context during long conversations. Recalibrating just two corrupted weight tensors restores coherence and code generation capabilities.
// ANALYSIS
This suggests that seemingly broken open-weight models may sometimes need only targeted weight surgery rather than complete retraining.
- The bug exposes a dangerous edge case with AdamW optimizers in hybrid architectures, where rarely activated experts can suffer massive weight drift.
- Modifying just two tensors in a 64GB model achieved an 88.6% error reduction, highlighting the fragility of late-layer weights.
- Developers training MoE and recurrent hybrids should manually inspect the weight scales of the final blocks to catch similar silent degradation.
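A rough illustration of what inspecting block scales could look like, not the method used in the original fix. The sketch below compares the RMS of each tensor against the median RMS of the same tensor across all blocks and flags large deviations. The tensor names, the `max_ratio` threshold, and the helper functions are all hypothetical, and tensors are plain lists of floats here; a real checkpoint would be read with a tensor library instead.

```python
import math
import re
from statistics import median


def rms(values):
    """Root-mean-square of a flat sequence of floats."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def tensor_kind(name):
    """Collapse block indices so the same tensor in every block shares a key.

    Assumes hypothetical names like 'blocks.12.mlp.down.weight'.
    """
    return re.sub(r"\.\d+\.", ".N.", name)


def find_scale_outliers(tensors, max_ratio=4.0):
    """Return {name: (rms, group_median)} for tensors whose RMS differs from
    the median RMS of their kind by more than max_ratio in either direction.

    max_ratio is an arbitrary illustrative threshold, not a recommended value.
    """
    groups = {}
    for name, values in tensors.items():
        groups.setdefault(tensor_kind(name), []).append((name, rms(values)))

    outliers = {}
    for members in groups.values():
        if len(members) < 3:
            continue  # too few blocks to form a meaningful median
        med = median(r for _, r in members)
        if med == 0:
            continue
        for name, r in members:
            ratio = r / med
            if ratio > max_ratio or ratio < 1 / max_ratio:
                outliers[name] = (r, med)
    return outliers


def rescale_to(values, target_rms):
    """Scale a tensor uniformly so its RMS matches target_rms."""
    current = rms(values)
    if current == 0:
        raise ValueError("cannot rescale an all-zero tensor")
    return [v * target_rms / current for v in values]


# Toy checkpoint: five blocks, with the last block's tensor inflated.
tensors = {
    f"blocks.{i}.mlp.down.weight": [0.02, -0.02, 0.02, -0.02] for i in range(5)
}
tensors["blocks.4.mlp.down.weight"] = [0.5, -0.5, 0.5, -0.5]

for name, (scale, med) in find_scale_outliers(tensors).items():
    print(f"{name}: rms={scale:.4f}, median of peers={med:.4f}")
    tensors[name] = rescale_to(tensors[name], med)
```

Uniform rescaling to the peer median is only one possible repair and may not suit every tensor; the comparison step is the part that matters for catching drift before release.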
// TAGS
qwen3.5 · llm · open-weights · fine-tuning · research
DISCOVERED
3d ago
2026-04-08
PUBLISHED
3d ago
2026-04-08
RELEVANCE
8/ 10
AUTHOR
EvilEnginer