OPEN_SOURCE
REDDIT // 31d ago · MODEL RELEASE
Nemotron 3 Super targets agentic reasoning
NVIDIA has released Nemotron 3 Super, a 120B-parameter open-weight hybrid Mamba-Transformer MoE model with 12B active parameters and a native 1M-token context window. It is built for long-horizon, multi-agent workloads like software development and cybersecurity; NVIDIA claims over 5x the throughput of the previous Nemotron Super and is shipping open datasets, training recipes, and deployment guides alongside the weights.
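A quick back-of-envelope on the 120B total / 12B active split (illustrative arithmetic from the headline numbers above, not a benchmark):

```python
# Rough decode-time compute per token: ~2 FLOPs per parameter touched.
# Parameter counts are the release's headline numbers; the ratio is the
# point, not the absolutes, which ignore attention, routing, and overhead.
TOTAL_PARAMS = 120e9    # all experts combined
ACTIVE_PARAMS = 12e9    # parameters actually fired per token

dense_flops = 2 * TOTAL_PARAMS   # a dense 120B model: ~240 GFLOPs/token
moe_flops = 2 * ACTIVE_PARAMS    # sparse MoE path:     ~24 GFLOPs/token

print(f"dense 120B: {dense_flops / 1e9:.0f} GFLOPs/token")
print(f"MoE active: {moe_flops / 1e9:.0f} GFLOPs/token")
print(f"ratio:      {dense_flops / moe_flops:.0f}x less compute per token")
```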
// ANALYSIS
This is NVIDIA making a serious play for the “agent brain” layer: not just bigger reasoning, but cheaper, longer-context reasoning that can stay aligned across sprawling multi-agent workflows.
- The 120B total / 12B active setup matters because only about a tenth of the weights fire per token, keeping inference compute down (per the sketch above) while still scaling to harder reasoning and coding tasks
- A native 1M-token context window directly targets the context explosion that breaks many long-running agent systems; the memory estimate after this list shows why full attention at that length is painful
- The hybrid Mamba-Transformer design, latent MoE, and multi-token prediction show NVIDIA optimizing for throughput as much as for raw benchmark bragging rights
- Open weights, datasets, and recipes make this more useful to developers than a closed API-only release, especially for teams that want to fine-tune or self-host (a loading sketch follows below)
- NVIDIA is also tying the model tightly to its own stack, from Blackwell NVFP4 optimization to NeMo, TensorRT-LLM, and NIM deployment paths (a NIM-style call is sketched at the end)
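To make the 1M-token point concrete, here is a rough KV-cache estimate for a pure-Transformer model at that length. The layer count, head count, and head dimension are illustrative assumptions, not the released Nemotron 3 Super config; Mamba layers carry a fixed-size state instead, which is the throughput argument for the hybrid design.

```python
# Illustrative KV-cache sizing for full attention at 1M tokens.
# All architecture numbers below are ASSUMED for illustration and do not
# reflect the actual Nemotron 3 Super layer mix.
SEQ_LEN = 1_000_000   # native context window from the release
LAYERS = 40           # assumed attention layers
KV_HEADS = 8          # assumed grouped-query KV heads
HEAD_DIM = 128        # assumed head dimension
BYTES = 2             # fp16/bf16 cache entries

kv_bytes = 2 * LAYERS * KV_HEADS * HEAD_DIM * SEQ_LEN * BYTES  # K and V
print(f"KV cache at 1M tokens: ~{kv_bytes / 1e9:.0f} GB per sequence")
# ~164 GB per sequence under these assumptions; constant-state Mamba layers
# avoid most of this, which is where a long-context throughput win comes from.
```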
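For self-hosting, a minimal loading sketch with Hugging Face transformers. The repo ID is an assumption for illustration; the release page is the source of truth for the actual identifier.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Nemotron-3-Super"  # assumed repo ID, not confirmed

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # pick up the checkpoint's native precision
    device_map="auto",       # shard across available GPUs
    trust_remote_code=True,  # hybrid Mamba-Transformer blocks often ship custom code
)

prompt = "Summarize the open files in this repository and propose a refactor plan."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```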
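On the NIM side, NVIDIA's microservices expose an OpenAI-compatible chat API, so a deployment call would look roughly like this; the model identifier is an assumption, not a confirmed value for this release.

```python
import os
from openai import OpenAI

# NVIDIA's hosted NIM endpoint; a self-hosted NIM substitutes its own URL.
client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=os.environ["NVIDIA_API_KEY"],
)

resp = client.chat.completions.create(
    model="nvidia/nemotron-3-super",  # assumed ID for illustration
    messages=[{"role": "user", "content": "Plan a multi-step code review across three repos."}],
    max_tokens=256,
)
print(resp.choices[0].message.content)
```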
// TAGS
nemotron-3-super · llm · reasoning · agent · open-weights · inference
DISCOVERED
2026-03-11
PUBLISHED
2026-03-11
RELEVANCE
9/10
AUTHOR
deeceeo