Nemotron Cascade 2 tops math, code
NVIDIA’s open 30B MoE reasoning model uses just 3B active parameters and is post-trained from Nemotron-3-Nano-30B-A3B-Base. NVIDIA says it reaches gold-medal-level results on IMO, IOI, and ICPC-style tasks, while supporting both thinking and instruct modes.
NVIDIA is making a strong case that post-training, not brute scale, is what buys frontier-ish performance here.
- The benchmark table suggests this 30B model can hang with or beat NVIDIA’s 120B-class sibling on several math and code tasks.
- Cascade RL plus multi-domain on-policy distillation looks like a reusable recipe for squeezing more capability out of open MoE models.
- Thinking and instruct modes make it more flexible for agent workflows, not just benchmark demos.
- Open weights plus released data/checkpoints make it unusually attractive for researchers and local model tinkerers.
DISCOVERED
2026-03-20
PUBLISHED
2026-03-20
AUTHOR
Middle_Bullfrog_6173