Moonshot AI debuts Attention Residuals architecture
Moonshot AI's Kimi Team has unveiled Attention Residuals, a novel architecture that replaces traditional static residual connections with depth-wise softmax attention. This allows each layer to selectively retrieve information from preceding layers, achieving a 1.25x compute efficiency gain and significant boosts in complex reasoning benchmarks.
Attention Residuals is the first serious rethink of the residual connection in a decade, replacing fixed addition with learned selectivity to prevent context loss in deep architectures. Using Block Attention Residuals, the system maintains hardware efficiency, adding under 2% latency overhead, while allowing models to autonomously organize their internal pathways. Scaling experiments show the architecture matches baseline performance with 25% less training compute, marking a foundational step toward more agentic AI reasoning.
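The core idea, as described above, is to replace the static residual `x + h` with a softmax attention over the outputs of all preceding layers, so each layer learns which earlier representations to retrieve. The sketch below is a minimal NumPy illustration of that depth-wise attention pattern, not Moonshot's actual implementation; all shapes, projection matrices, and the function name are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def depthwise_attention_residual(layer_outputs, x, Wq, Wk):
    """Hypothetical sketch: replace the static residual x + h with a
    learned mixture of all previous layers' outputs, chosen by softmax
    attention over the depth dimension.

    layer_outputs: list of (d,) hidden states from earlier layers
    x: (d,) current sublayer output (acts as the query)
    Wq, Wk: (d, d_k) illustrative query/key projection matrices
    """
    H = np.stack(layer_outputs)              # (L, d) stacked layer outputs
    q = x @ Wq                               # (d_k,) query from current layer
    K = H @ Wk                               # (L, d_k) keys, one per earlier layer
    scores = K @ q / np.sqrt(Wk.shape[1])    # (L,) scaled dot-product scores
    w = softmax(scores)                      # attention weights over depth
    residual = w @ H                         # (d,) selectively retrieved residual
    return x + residual
```

A static residual connection corresponds to the special case where the weight on the immediately preceding layer is fixed at 1; here the mixture is learned, which is what lets deep layers re-access early-layer features directly.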
DISCOVERED: 2026-04-01
PUBLISHED: 2026-04-01
AUTHOR: Regular-Substance795