AutoMuon drops as drop-in AdamW replacement
REDDIT // 4h ago // OPEN-SOURCE RELEASE

AutoMuon automates the integration of the Muon optimizer into PyTorch training pipelines, acting as a one-line replacement for AdamW. It automatically routes 2D projection weights to Muon for orthogonalized updates while keeping embeddings, norms, and biases on AdamW, eliminating the manual parameter grouping previously required to leverage Muon's training speedups.
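
The routing described above can be sketched in plain Python. This is a minimal illustration, not AutoMuon's actual code: the function name `route_parameters`, the `ADAMW_NAME_HINTS` fragments, and the example parameter names are all hypothetical, and the rule used here (strictly 2D weights go to Muon, everything else falls back to AdamW) is an assumption based on the summary. In a real PyTorch model the `(name, shape)` pairs would come from `model.named_parameters()`.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical name fragments for parameters that stay on AdamW,
# mirroring the summary: embeddings, norms, and biases.
ADAMW_NAME_HINTS = ("embed", "norm", "bias")


def route_parameters(
    named_shapes: Iterable[Tuple[str, Tuple[int, ...]]]
) -> Dict[str, List[str]]:
    """Split parameter names into a Muon group and an AdamW group."""
    groups: Dict[str, List[str]] = {"muon": [], "adamw": []}
    for name, shape in named_shapes:
        is_matrix = len(shape) == 2
        looks_excluded = any(hint in name.lower() for hint in ADAMW_NAME_HINTS)
        if is_matrix and not looks_excluded:
            groups["muon"].append(name)
        else:
            # Conservative fallback: anything that is not clearly a
            # 2D projection weight stays on AdamW (assumption of this sketch).
            groups["adamw"].append(name)
    return groups


# Made-up parameter names and shapes for illustration only.
example = [
    ("tok_embed.weight", (50257, 768)),
    ("blocks.0.attn.q_proj.weight", (768, 768)),
    ("blocks.0.attn.q_proj.bias", (768,)),
    ("blocks.0.ln1.weight", (768,)),
    ("blocks.0.mlp.fc.weight", (3072, 768)),
    ("conv_stem.weight", (64, 3, 7, 7)),
]
groups = route_parameters(example)
print(groups["muon"])
print(groups["adamw"])
```

In this sketch the 4D convolution weight lands in the AdamW group simply because it is not 2D; how AutoMuon itself treats convolution kernels is not stated in the summary.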

// ANALYSIS

AutoMuon democratizes the high-performance Muon optimizer by removing the implementation friction that previously limited its use to specialized "speedrun" repositories.

  • Automates the parameter routing needed to apply Muon’s orthogonalized updates only to the weights they suit (2D projection matrices)
  • Reportedly reaches AdamW's baseline accuracy in fewer epochs and ends at a higher final accuracy on the reported benchmarks
  • Works with DistributedDataParallel (DDP) and standard PyTorch schedulers, which eases adoption in existing training setups
  • Conservative scanning logic falls back to AdamW for ambiguous or custom architectural components
  • Could reduce compute costs for large-scale transformer and CNN training workloads
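
For readers unfamiliar with what an "orthogonalized update" means, here is a rough sketch. Muon, as publicly described, replaces a 2D update matrix with an approximately orthogonal one using a Newton-Schulz iteration. The code below uses the classic cubic iteration X ← 1.5·X − 0.5·X·XᵀX rather than Muon's tuned polynomial, and it is written in pure Python on nested lists for readability; the function names and the example matrix are hypothetical and are not taken from AutoMuon.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain nested-list matrix product."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def newton_schulz_orthogonalize(g: Matrix, steps: int = 10) -> Matrix:
    """Push every singular value of g toward 1 (cubic Newton-Schulz)."""
    # Dividing by the Frobenius norm keeps all singular values at or below 1,
    # inside the range where this iteration converges.
    frob = sum(v * v for row in g for v in row) ** 0.5
    x = [[v / (frob + 1e-7) for v in row] for row in g]
    for _ in range(steps):
        x_xtx = matmul(x, matmul(transpose(x), x))
        x = [
            [1.5 * x[i][j] - 0.5 * x_xtx[i][j] for j in range(len(x[0]))]
            for i in range(len(x))
        ]
    return x


# Made-up 2x2 "gradient" for illustration.
update = newton_schulz_orthogonalize([[3.0, 1.0], [0.0, 2.0]])
gram = matmul(transpose(update), update)  # close to the identity matrix
print(gram)
```

The iteration only makes sense for matrices, which is one way to see why 1D parameters such as biases and norm gains are left to AdamW.
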
// TAGS
automuon · open-source · mlops · llm · benchmark · devtool

DISCOVERED

4h ago

2026-04-26

PUBLISHED

4h ago

2026-04-26

RELEVANCE

8/10

AUTHOR

Skye7821