OPEN_SOURCE
REDDIT // 4h ago · OPEN-SOURCE RELEASE
Rose optimizer ships stateless, low-VRAM
Rose is a new Apache 2.0 PyTorch optimizer that uses per-slice gradient-range normalization instead of momentum-style state. The author says it cuts optimizer memory to zero, trains with low VRAM, and can match or slightly beat AdamW on some small benchmarks and an LLM token-golf test.
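The mechanism the summary describes — replacing momentum-style buffers with statistics recomputed from the current gradient — can be sketched roughly as follows. This is a hypothetical illustration, not the Rose implementation: the exact slicing, normalization, and update rule are assumptions (here, per-row range normalization in NumPy), shown only to make "stateless" concrete.

```python
import numpy as np

def stateless_range_step(param, grad, lr=1e-3, eps=1e-8):
    """One update using per-slice gradient-range normalization.

    Nothing is stored between steps: every statistic below is
    recomputed from the current gradient, so optimizer memory is
    zero. Hypothetical sketch, not the actual Rose update rule.
    """
    # Per-slice (here: per-row) dynamic range of the gradient.
    g_range = grad.max(axis=-1, keepdims=True) - grad.min(axis=-1, keepdims=True)
    # Scale each slice so the update magnitude is range-invariant.
    update = grad / (g_range + eps)
    return param - lr * update

# Two rows with gradients of wildly different scales get
# comparable-magnitude updates after range normalization.
w = np.zeros((2, 4))
g = np.array([[100.0, -100.0, 50.0, -50.0],
              [0.01, -0.01, 0.005, -0.005]])
w = stateless_range_step(w, g, lr=0.1)
```

Because the normalizer depends only on the current batch's gradient, noisy batches move the statistics directly, which is the per-batch sensitivity the analysis below points out.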
// ANALYSIS
This is a genuinely interesting optimizer idea, but the current evidence looks more like a promising niche win than a universal AdamW replacement.
- Zero optimizer state is the real differentiator: if the claims hold, Rose is attractive anywhere VRAM is the bottleneck and optimizer buffers are painful.
- The benchmark story is mixed in a healthy way: MNIST is competitive, and the parameter-golf run shows a modest validation improvement, not a blowout.
- The approach trades historical moments for instantaneous gradient-range statistics, which simplifies memory but makes the optimizer more dependent on per-batch behavior and hyperparameter tuning.
- The repo’s positioning is practical, not theoretical: easy install, Python 3.10+, PyTorch 2.0+, and explicit support for gradient centralization, trust gating, and BF16 rounding.
- Broader adoption will hinge on independent replications at larger scale, because small benchmark wins in optimizers can disappear fast outside the author’s setup.
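For scale, the buffer savings behind the first bullet are easy to quantify: AdamW keeps two fp32 state tensors (first and second moment) per parameter, while a stateless optimizer keeps none. A back-of-envelope estimate, with the model size picked arbitrarily for illustration:

```python
# Optimizer-state memory: AdamW vs. a stateless optimizer.
# AdamW stores two fp32 buffers (exp_avg, exp_avg_sq) per parameter.
n_params = 7_000_000_000           # example model size (assumption)
bytes_per_fp32 = 4
adamw_state_gb = 2 * n_params * bytes_per_fp32 / 1e9
stateless_state_gb = 0.0
print(f"AdamW optimizer state: {adamw_state_gb:.0f} GB")
print(f"Stateless optimizer state: {stateless_state_gb:.0f} GB")
```

At that size the two moment buffers alone cost 56 GB, which is why "optimizer memory to zero" matters most exactly where the first bullet says it does.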
// TAGS
rose · open-source · mlops · llm · fine-tuning · gpu
DISCOVERED
4h ago
2026-04-24
PUBLISHED
5h ago
2026-04-24
RELEVANCE
8/10
AUTHOR
ECF630