qwen-ssm-repair fixes Qwen 3.5 weight drift
The qwen-ssm-repair utility corrects numerical weight drift in the Gated Delta Network layers of Qwen 3.5, a drift that causes context collapse at 75k tokens and beyond. Using statistical outlier detection and targeted alpha-scaling, it restores model stability without retraining or full-model fine-tuning.
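To make the two ideas in this summary concrete, here is a minimal sketch of outlier detection followed by alpha-scaling. It is an illustration under stated assumptions, not the utility's actual code: the source does not say which statistic or threshold qwen-ssm-repair uses, so the robust z-score (median and MAD), the 3.5 cutoff, the per-layer standard deviation as the drift signal, and the helper names `find_outliers` and `alpha_scale` are all hypothetical. The layers are synthetic stand-ins, not real Qwen 3.5 weights.

```python
import numpy as np

def find_outliers(stats, threshold=3.5):
    """Flag entries whose robust z-score (median/MAD based) exceeds threshold.

    ASSUMPTION: the real tool's statistic and cutoff are not documented here.
    """
    values = np.asarray(stats, dtype=np.float64)
    median = float(np.median(values))
    mad = float(np.median(np.abs(values - median)))
    if mad == 0.0:
        return [], median  # no spread, nothing can be flagged
    z = 0.6745 * (values - median) / mad
    return [int(i) for i in np.flatnonzero(np.abs(z) > threshold)], median

def alpha_scale(weight, target_std):
    """Rescale a tensor by a single scalar alpha so its std matches target_std."""
    alpha = target_std / float(weight.std())
    return weight * alpha, alpha

# Synthetic "layers": one unit-std base matrix scaled to known magnitudes.
rng = np.random.default_rng(0)
base = rng.standard_normal((64, 64))
base = (base - base.mean()) / base.std()
scales = [0.020, 0.021, 0.019, 0.020, 0.022, 0.080, 0.021, 0.019]
layers = [s * base for s in scales]  # layer 5 simulates a drifted tensor

stds = [float(w.std()) for w in layers]
outliers, median_std = find_outliers(stds)

applied = {}
for i in outliers:
    layers[i], applied[i] = alpha_scale(layers[i], median_std)

print("outlier layers:", outliers)
print("alpha per repaired layer:", applied)
```

A single scalar per tensor keeps the repair surgical in the sense the summary describes: the relative structure inside the flagged tensor is preserved, and untouched layers are not modified at all.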
The Qwen 3.5 weight drift illustrates how architectural complexity in hybrid SSM-Transformer models can produce subtle failure modes that slip past standard benchmarks such as needle-in-a-haystack (NIAH). Surgical repair that combines statistical anomaly detection with in-place patching through mmap offers a fast, low-cost alternative to fine-tuning, especially for local LLM users facing quantization-induced variance in sensitive SSM layers.
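The in-place patching idea can be sketched with NumPy's `np.memmap`, which maps a byte range of a file into memory so that only the touched region is rewritten. This is a hedged illustration, not the tool's implementation: it assumes a raw float32 file with a known byte offset, whereas real checkpoint formats carry headers and per-tensor offsets that would have to be parsed first. The function name `patch_in_place` and the demo file are hypothetical.

```python
import os
import tempfile
import numpy as np

def patch_in_place(path, offset, count, alpha, dtype=np.float32):
    """Multiply `count` values starting at byte `offset` by alpha, on disk.

    ASSUMPTION: the caller already knows the tensor's byte offset and dtype.
    """
    view = np.memmap(path, dtype=dtype, mode="r+", offset=offset, shape=(count,))
    view *= alpha   # modifies the mapped pages directly
    view.flush()    # push the changes back to the file
    del view        # release the mapping

# Demo on a throwaway file holding 16 float32 values (0.0 .. 15.0).
tmp = tempfile.NamedTemporaryFile(suffix=".bin", delete=False)
tmp.close()
np.arange(16, dtype=np.float32).tofile(tmp.name)

# Halve values 4..7; each float32 is 4 bytes, so the byte offset is 4 * 4.
patch_in_place(tmp.name, offset=4 * 4, count=4, alpha=0.5)

patched = np.fromfile(tmp.name, dtype=np.float32)
os.remove(tmp.name)
print(patched)
```

Because the file is edited in place, the rest of the checkpoint is never copied or rewritten, which is what makes this approach attractive for large local model files. It also means a backup is the only way to undo a bad patch.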
DISCOVERED
7h ago
2026-04-12
PUBLISHED
8h ago
2026-04-12
AUTHOR
Decivox