Nemotron 3 Super Uncensored Hits 92% MMLU
Dealign.AI's Mac-only MLX build of Nemotron 3 Super strips the safety layer from NVIDIA's open-weight 120B-class model and claims a 92% MMLU score, up from 86% for the base JANG_2L variant. If reproducible, that would suggest MMLU scores are sensitive to alignment and chat behavior, not just raw model size.
This is a fun local-model data point, but also a reminder that benchmark jumps after "uncensoring" can come from prompting, formatting, or reasoning-mode differences as much as from actual capability gains.
- NVIDIA's official Nemotron 3 Super is already a strong open-weight hybrid MoE reasoning model, so a community ablated fork can plausibly change benchmark behavior without changing the backbone.
- The reported 92% vs 86% gap is notable, but it is still self-reported until someone else reruns the same eval with identical settings and template handling.
- Mac-only MLX support matters here: a 46 GB 120B-class model that runs locally on Apple Silicon is unusually practical for power users.
- The bigger takeaway is that "safety" layers, chat templates, and reasoning toggles can materially affect multiple-choice scores, which makes benchmark interpretation tricky.
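
To make the last point concrete, here is a minimal Python sketch of how the same set of model replies can produce very different multiple-choice scores depending on how answers are extracted. Everything in it is hypothetical: the four replies, the gold labels, and the two extraction functions are toy stand-ins, not the harness or data behind the reported 92% and 86% figures.

```python
import re

# Hypothetical raw replies to four multiple-choice questions (toy data,
# not taken from any real evaluation run).
outputs = [
    "B",
    "The answer is C.",
    "I can't help with that request.",
    "<think>Option A looks wrong...</think>\nAnswer: D",
]
gold = ["B", "C", "A", "D"]


def strict_extract(text):
    """Accept the reply only if it is exactly one bare letter A-D."""
    stripped = text.strip()
    return stripped if stripped in {"A", "B", "C", "D"} else None


def lenient_extract(text):
    """Drop any <think>...</think> block, then take the last standalone A-D."""
    visible = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    letters = re.findall(r"\b([ABCD])\b", visible)
    return letters[-1] if letters else None


def accuracy(extract):
    """Fraction of replies whose extracted letter matches the gold label."""
    hits = sum(extract(o) == g for o, g in zip(outputs, gold))
    return hits / len(gold)


print("strict: ", accuracy(strict_extract))   # 0.25
print("lenient:", accuracy(lenient_extract))  # 0.75
```

In this toy case the refusal counts as a miss under both scorers, while the verbose and reasoning-style replies only count under the lenient one. A rerun that changes refusals, verbosity, or reasoning mode can therefore shift the score without any change in what the model knows, which is why identical settings and template handling matter for a fair comparison.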
Discovered: 2026-03-21
Published: 2026-03-21
Author: HealthyCommunicat
