OPEN_SOURCE
REDDIT · 29d ago · MODEL RELEASE
NVIDIA Nemotron 3 Super: open-weight 120B MoE, 1M context
NVIDIA has released Nemotron 3 Super, a 120B-parameter open-weight hybrid Mamba-Transformer MoE model that activates only 12B parameters at inference, with a 1-million-token context window built for agentic workflows. It ships with full open weights, its 25T-token pretraining dataset, and training recipes, alongside same-day integrations across AWS, Azure, Google Cloud, and major inference providers.
// ANALYSIS
NVIDIA is playing the long game in open-weights AI: not just releasing a model, but the full stack — data, recipes, RL environments — making Nemotron Super a platform, not just a checkpoint.
- The Mamba-Transformer hybrid architecture is genuinely novel at this scale: linear-time Mamba layers handle long context cheaply while Transformer attention handles precise recall, sidestepping the memory wall that kills dense-attention models at 1M tokens
- 12B active parameters from a 120B pool means inference cost is closer to that of a 12B model: competitive with Llama-class efficiency while vastly outperforming it on context length
- Multi-Token Prediction delivering 3x wall-clock speedups for structured generation is huge for agentic use cases where output volume (tool calls, code) dominates latency
- Same-day enterprise adoption from Perplexity, CodeRabbit, Palantir, and Cloudflare Workers AI signals this isn't a research drop; it's production-ready
- NVFP4 native pretraining is a subtle but strategic move: it locks in Blackwell GPU advantages and widens the perf gap for anyone running on NVIDIA hardware
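The MoE efficiency claim above can be sanity-checked with back-of-envelope arithmetic. This sketch uses the common rule of thumb that a forward pass costs roughly 2 FLOPs per *active* parameter per token; the function name and the dense-model comparison are illustrative assumptions, not anything from NVIDIA's release.

```python
# Back-of-envelope sketch (assumption: ~2 FLOPs per active parameter
# per token for a forward pass) showing why a 120B-total / 12B-active
# MoE prices inference like a ~12B dense model.

def flops_per_token(active_params: float) -> float:
    """Approximate forward-pass FLOPs to generate one token."""
    return 2.0 * active_params

moe_active = 12e9      # Nemotron 3 Super: 12B of 120B parameters active
dense_total = 120e9    # hypothetical dense model of the same total size

ratio = flops_per_token(dense_total) / flops_per_token(moe_active)
print(f"MoE per-token compute: {flops_per_token(moe_active):.1e} FLOPs")
print(f"An equal-size dense model would cost ~{ratio:.0f}x more per token")
```

The memory picture is less favorable than the compute picture: all 120B weights must still be resident (or paged) for routing, so the MoE saves FLOPs and latency, not VRAM.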
// TAGS
nemotron-3-super · nvidia · llm · open-weights · agent · inference · mcp · reasoning · open-source
DISCOVERED
29d ago
2026-03-14
PUBLISHED
31d ago
2026-03-12
RELEVANCE
9/10
AUTHOR
No-Swing2206