OPEN_SOURCE ↗
REDDIT · 24d ago · OPEN_SOURCE RELEASE
PyTorch 2.9 ships Muon optimizer
PyTorch 2.9 adds `torch.optim.Muon`, a specialized optimizer for a network's 2D hidden-layer weights, with embeddings, biases, and output heads intended to stay on AdamW (https://docs.pytorch.org/docs/stable/generated/torch.optim.Muon.html). In the current docs it is still a for-loop optimizer with no foreach or fused path, so the immediate win looks more experimental than plug-and-play.
// ANALYSIS
Muon is interesting, but it reads like an optimizer you benchmark carefully, not one you casually flip on for every fine-tune.
- PyTorch's docs say Muon is meant for 2D hidden-layer parameters; non-2D params still belong on AdamW, so parameter grouping is the first real hurdle.
- The 2.9 optimizer table lists Muon as `for-loop` only, with no foreach or fused implementation, which suggests its gains are algorithmic rather than kernel-level: https://docs.pytorch.org/docs/2.9/optim.html
- For VRAM-constrained training, the appeal is optimizer-state efficiency, not a drop-in replacement for AdamW.
- The Reddit thread is still empty, which fits the current vibe: curious, promising, but not yet battle-tested in the local fine-tuning crowd: https://www.reddit.com/r/LocalLLaMA/comments/1rxe7jl/torchoptimmuon_is_now_in_pytorch_29_anyone/
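The parameter-grouping hurdle from the first bullet can be sketched in plain Python. This is a minimal, hypothetical routing rule, not the official recipe: it treats any 2D weight as a Muon candidate unless its name suggests an embedding or output head (the name filters here are made up for illustration, and real models will need their own).

```python
def split_for_muon(named_shapes):
    """Route 2D hidden-layer weights to Muon and everything else
    (embeddings, biases, output heads) to AdamW.

    `named_shapes` is an iterable of (parameter_name, shape) pairs,
    e.g. from (name, p.shape) over model.named_parameters().
    The substring filters below are illustrative assumptions.
    """
    muon, adamw = [], []
    for name, shape in named_shapes:
        is_hidden_2d = len(shape) == 2 and not any(
            key in name for key in ("embed", "head")
        )
        (muon if is_hidden_2d else adamw).append(name)
    return muon, adamw

# Toy transformer-ish parameter list (names and shapes are invented):
params = [
    ("embed.weight",             (50257, 768)),  # 2D but an embedding -> AdamW
    ("blocks.0.attn.qkv.weight", (768, 2304)),   # hidden 2D weight    -> Muon
    ("blocks.0.attn.qkv.bias",   (2304,)),       # 1D bias             -> AdamW
    ("blocks.0.mlp.fc.weight",   (768, 3072)),   # hidden 2D weight    -> Muon
    ("lm_head.weight",           (768, 50257)),  # output head         -> AdamW
]
muon_names, adamw_names = split_for_muon(params)
```

With groups like these you would then build two optimizers, roughly `torch.optim.Muon(muon_params, ...)` and `torch.optim.AdamW(adamw_params, ...)`, and step both each iteration; check the linked Muon docs for the actual constructor arguments, since they are not spelled out in this post.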
// TAGS
muon · pytorch · open-source · fine-tuning · gpu · research
DISCOVERED
24d ago
2026-03-18
PUBLISHED
24d ago
2026-03-18
RELEVANCE
8 / 10
AUTHOR
Sensitive-Two9732