OPEN_SOURCE
REDDIT // 22d ago · BENCHMARK RESULT
Nemotron 3 Super Uncensored Hits 92% MMLU
Dealign.AI's Mac-only MLX build of Nemotron 3 Super strips the safety layer from NVIDIA's open-weight 120B-class model and claims a 92% MMLU score, up from 86% for the base JANG_2L variant. If reproducible, that points to meaningful benchmark sensitivity around alignment and chat behavior, not just raw model size.
// ANALYSIS
This is a fun local-model data point, but also a reminder that benchmark jumps after "uncensoring" can come from prompting, formatting, or reasoning-mode differences as much as from actual capability gains.
- NVIDIA's official Nemotron 3 Super is already a strong open-weight hybrid MoE reasoning model, so a community ablated fork can plausibly change benchmark behavior without changing the backbone.
- The reported 92% vs 86% gap is notable, but it is still self-reported until someone else reruns the same eval with identical settings and template handling.
- Mac-only MLX support matters here: a 46 GB 120B-class model that runs locally on Apple Silicon is unusually practical for power users.
- The bigger takeaway is that "safety" layers, chat templates, and reasoning toggles can materially affect multiple-choice scores, which makes benchmark interpretation tricky.
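The template-sensitivity point above is easy to demonstrate concretely. The sketch below is hypothetical (not Dealign.AI's or anyone's actual eval harness): it grades the same four model outputs under two common answer-extraction rules and gets different accuracies, showing how a multiple-choice score can move without any change in underlying capability.

```python
import re

# Hypothetical raw model outputs for four MMLU-style questions
# (gold answers: B, C, A, D). Illustrative only, not real Nemotron output.
outputs = [
    "B",                     # bare letter
    "The answer is C.",      # letter embedded in a sentence
    "A) Paris",              # letter plus option text
    "Let me think... D",     # reasoning preamble before the letter
]
gold = ["B", "C", "A", "D"]

def strict_extract(text):
    """Grade only if the very first character is an answer letter."""
    first = text.strip()[:1].upper()
    return first if first in "ABCD" else None

def lenient_extract(text):
    """Take the last standalone A-D letter anywhere in the output."""
    matches = re.findall(r"\b([ABCD])\b", text.upper())
    return matches[-1] if matches else None

def accuracy(extract):
    hits = sum(extract(o) == g for o, g in zip(outputs, gold))
    return hits / len(gold)

print(f"strict:  {accuracy(strict_extract):.0%}")   # 50%: chatty answers marked wrong
print(f"lenient: {accuracy(lenient_extract):.0%}")  # 100%: same outputs, different grader
```

An ablated model that answers with bare letters instead of chatty refusal-hedged sentences would score higher under the strict rule even if it knows nothing more, which is why identical extraction and template settings matter for the 92% vs 86% comparison.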
// TAGS
nemotron-3-super-120b-uncensored · llm · benchmark · reasoning · safety · open-weights · self-hosted
DISCOVERED
22d ago
2026-03-21
PUBLISHED
22d ago
2026-03-21
RELEVANCE
9 / 10
AUTHOR
HealthyCommunicat