OPEN_SOURCE
REDDIT · 29d ago · OPEN-SOURCE RELEASE
Nemotron-3-Super-120B runs uncensored on Apple Silicon
A community release strips safety guardrails from NVIDIA's hybrid Nemotron-Super-120B model using CRACK weight surgery, producing a 4-bit MLX-quantized variant that runs at 43–58 tok/s on Apple Silicon. A HumanEval score of 94% confirms that coding capability is largely preserved post-modification.
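For background on what a 4-bit quant of this kind involves, here is a minimal sketch of symmetric group quantization in the style commonly used for on-device weights. The group size, symmetric scheme, and function names are illustrative assumptions, not the actual MLX kernel used by this release.

```python
import numpy as np

def quantize_4bit(w, group_size=64):
    """Symmetric 4-bit group quantization (illustrative, not the MLX kernel).

    Each group of `group_size` consecutive weights shares one scale; values
    are mapped to signed integers in [-8, 7] (here only [-7, 7] is used,
    since the scale is derived from the group's max absolute value).
    """
    w = w.reshape(-1, group_size)
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    scale = np.where(scale == 0, 1.0, scale)  # avoid divide-by-zero groups
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_4bit(q, scale, shape):
    """Reconstruct approximate fp32 weights from quantized groups."""
    return (q.astype(np.float32) * scale).reshape(shape)

# Round-trip a random weight matrix and measure worst-case error.
rng = np.random.default_rng(0)
w = rng.standard_normal((128, 128)).astype(np.float32)
q, s = quantize_4bit(w.ravel())
w_hat = dequantize_4bit(q, s, w.shape)
err = np.abs(w - w_hat).max()
```

Per-element error is bounded by half a quantization step (scale / 2), which is why 4-bit weights can preserve most task capability while cutting memory roughly 4x versus fp16.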
// ANALYSIS
Community uncensored releases of frontier-class models keep pace with official launches, and Nemotron's novel hybrid architecture made this a genuinely hard technical problem to solve.
- Nemotron-Super-120B's unique three-pathway design (40 Mamba-2 SSM layers, 40 LatentMoE layers with 512 experts and top-22 routing, and 8 attention layers) breaks standard fp16-then-quantize workflows; all surgery must happen at the quantization level
- CRACK weight surgery targets the architectural convergence point of all three pathway types, suppressing refusal behavior at the weight level rather than via prompt injection or fine-tuning
- 4-bit MLX quant achieves 43–58 tok/s on an M3 Ultra with 256GB, putting 120B-class local inference within reach for well-equipped Mac users
- LM Studio silently drops 697 essential tensors and is incompatible; only MLX Studio or vMLX work correctly, a notable gotcha for the community
- A chat template workaround introduced occasional missing closing think tags, an acknowledged tradeoff of the approach
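The summary does not describe CRACK's internals, but one widely discussed weight-level refusal-suppression technique is directional ablation: projecting a hypothesized "refusal direction" out of a layer's output weights so refusal features cannot be written into the residual stream. The sketch below assumes a single direction and generic names; it is not the release's actual procedure.

```python
import numpy as np

def ablate_direction(W, d):
    """Remove the component of W's output space along direction d.

    W: (d_out, d_in) weight matrix; d: (d_out,) vector along which refusal
    activations are hypothesized to live. Returns (I - d d^T) @ W, so no
    output of the modified layer has a component along d.
    """
    d = d / np.linalg.norm(d)          # work with a unit vector
    return W - np.outer(d, d @ W)      # subtract the projection onto d

# Demonstrate that the ablated weights produce no output along d.
rng = np.random.default_rng(1)
W = rng.standard_normal((16, 8))
d = rng.standard_normal(16)
W2 = ablate_direction(W, d)
resid = (d / np.linalg.norm(d)) @ W2   # component of outputs along d
```

Because the edit lives in the weights themselves, it survives quantization and needs no runtime prompt tricks, which matches the bullet's contrast with prompt injection and fine-tuning.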
// TAGS
llm · open-weights · open-source · self-hosted · inference · benchmark
DISCOVERED
2026-03-14
PUBLISHED
2026-03-14
RELEVANCE
6/10
AUTHOR
HealthyCommunicat