Rose optimizer ships stateless, low-VRAM
REDDIT · 4h ago · OPEN_SOURCE RELEASE


Rose is a new Apache 2.0 PyTorch optimizer that uses per-slice gradient-range normalization instead of momentum-style state. The author says it cuts optimizer memory to zero, trains with low VRAM, and can match or slightly beat AdamW on some small benchmarks and an LLM token-golf test.

// ANALYSIS

This is a genuinely interesting optimizer idea, but the current evidence looks more like a promising niche win than a universal AdamW replacement.

  • Zero optimizer state is the real differentiator: if the claims hold, Rose is attractive anywhere VRAM is the bottleneck and optimizer buffers are painful.
  • The benchmark story is mixed in a healthy way: MNIST is competitive, and the parameter-golf run shows a modest validation improvement, not a blowout.
  • The approach trades historical moments for instantaneous gradient-range statistics, which eliminates optimizer memory but makes updates more dependent on per-batch gradient behavior and hyperparameter tuning.
  • The repo’s positioning is practical, not theoretical: easy install, Python 3.10+, PyTorch 2.0+, and explicit support for gradient centralization, trust gating, and BF16 rounding.
  • Broader adoption will hinge on independent replications at larger scale, because small benchmark wins in optimizers can disappear fast outside the author’s setup.
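To make the trade-off above concrete, here is a minimal sketch of the general idea: normalizing each slice of the gradient by its instantaneous range (max minus min) and stepping without any momentum or variance buffers. This is an illustration of the stateless, range-normalized approach, not Rose's actual algorithm; the function name, slicing convention (one slice per row), and hyperparameters are assumptions.

```python
import numpy as np

def range_normalized_step(param, grad, lr=1e-3, eps=1e-8):
    """Illustrative stateless update (NOT Rose's real implementation):
    normalize each slice of the gradient by its instantaneous range,
    then take a plain SGD-style step. No buffers persist between calls."""
    g = grad.reshape(grad.shape[0], -1)  # treat each row as one slice
    # Per-slice range computed from this batch's gradient alone,
    # in place of AdamW's running first/second moments.
    rng = g.max(axis=1, keepdims=True) - g.min(axis=1, keepdims=True)
    g_norm = g / (rng + eps)
    return param - lr * g_norm.reshape(grad.shape)
```

Because the update depends only on the current gradient, there is nothing analogous to AdamW's `exp_avg`/`exp_avg_sq` state to store, which is where the zero-optimizer-memory claim comes from; the flip side is that a single noisy batch directly shapes the step.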
// TAGS
rose · open-source · mlops · llm · fine-tuning · gpu

DISCOVERED

4h ago

2026-04-24

PUBLISHED

5h ago

2026-04-24

RELEVANCE

8/10

AUTHOR

ECF630