Rose optimizer ships stateless, low-VRAM
REDDIT · 4h ago · OPEN_SOURCE RELEASE


Rose is a new Apache 2.0 PyTorch optimizer that uses per-slice gradient-range normalization instead of momentum-style state. The author says it cuts optimizer memory to zero, trains with low VRAM, and can match or slightly beat AdamW on some small benchmarks and an LLM token-golf test.

// ANALYSIS

This is a genuinely interesting optimizer idea, but the current evidence looks more like a promising niche win than a universal AdamW replacement.

  • Zero optimizer state is the real differentiator: if the claims hold, Rose is attractive anywhere VRAM is the bottleneck and optimizer buffers are painful.
  • The benchmark story is mixed in a healthy way: MNIST is competitive, and the parameter-golf run shows a modest validation improvement, not a blowout.
  • The approach trades historical moments for instantaneous gradient-range statistics, which eliminates optimizer memory but makes updates more dependent on per-batch gradient behavior and hyperparameter tuning.
  • The repo’s positioning is practical, not theoretical: easy install, Python 3.10+, PyTorch 2.0+, and explicit support for gradient centralization, trust gating, and BF16 rounding.
  • Broader adoption will hinge on independent replications at larger scale, because small benchmark wins in optimizers can disappear fast outside the author’s setup.
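To make the trade-off above concrete, here is a minimal sketch of the general idea: normalizing each slice of the gradient by its instantaneous range (max minus min) and stepping without any momentum or variance buffers. This is an illustration of the stateless, range-normalized approach, not Rose's actual algorithm; the function name, slicing convention (one slice per row), and hyperparameters are assumptions.

```python
import numpy as np

def range_normalized_step(param, grad, lr=1e-3, eps=1e-8):
    """Illustrative stateless update (NOT Rose's real implementation):
    normalize each slice of the gradient by its instantaneous range,
    then take a plain SGD-style step. No buffers persist between calls."""
    g = grad.reshape(grad.shape[0], -1)  # treat each row as one slice
    # Per-slice range computed from this batch's gradient alone,
    # in place of AdamW's running first/second moments.
    rng = g.max(axis=1, keepdims=True) - g.min(axis=1, keepdims=True)
    g_norm = g / (rng + eps)
    return param - lr * g_norm.reshape(grad.shape)
```

Because the update depends only on the current gradient, there is nothing analogous to AdamW's `exp_avg`/`exp_avg_sq` state to store, which is where the zero-optimizer-memory claim comes from; the flip side is that a single noisy batch directly shapes the step.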
// TAGS
rose · open-source · mlops · llm · fine-tuning · gpu

DISCOVERED

4h ago

2026-04-24

PUBLISHED

5h ago

2026-04-24

RELEVANCE

8/10

AUTHOR

ECF630