OpenRouter hosts free Nemotron-3 Ultra
NVIDIA's Nemotron-3 Ultra, a 550-billion-parameter open-weights model designed for autonomous agents, is now available on OpenRouter. The Mixture-of-Experts model features 55 billion active parameters, a one-million-token context window, and inference speeds exceeding 300 tokens per second.
NVIDIA's Nemotron-3 Ultra proves that frontier-class agentic models can be open-weights, but the hardware requirements to run a 550B parameter model locally mean "open" is a relative term that mostly benefits well-funded enterprises.
* The model's hybrid LatentMoE architecture (combining Transformers and Mamba-2) indicates a shift towards more efficient sequence modeling at extreme scale.
* A 1M context window combined with 300+ tok/s throughput makes it highly suitable for deep research and long-horizon agent orchestration.
* Platforms like OpenRouter will be essential for keeping such massive models accessible to individual developers and startups.
DISCOVERED
2h ago
2026-06-04
PUBLISHED
2h ago
2026-06-04
RELEVANCE
AUTHOR
bridgemindai