Nemotron-3-Nano-4B abliteration removes GenRM censorship
Developer HauhauCS has released the first "aggressive" abliteration of NVIDIA's Nemotron-3-Nano-4B, achieving a 0/465 refusal score on safety benchmarks. The abliteration specifically targets the Generative Reward Model (GenRM), which acts as a secondary layer of real-time generation censorship. The release also includes custom "K_P" quants that use model-specific analysis to deliver roughly one to two quality tiers above standard GGUF quants at a minimal increase in file size.
This release marks a significant escalation in the "cat-and-mouse" game of model alignment by identifying and neutralizing internal reward-driven self-censorship. GenRM removal prevents the "CoT-to-refusal" pivot, where a model reasons through a request correctly but switches to a canned refusal in its final output. The hybrid Mamba2-Transformer architecture offers high performance with a 262K native context window in a compact 4B-parameter package. The custom K_P quants are a meaningful optimization for local LLM users, squeezing Q6-level quality into near-Q4 file sizes. Per the developer, this is the first public demonstration that reward-model layers can be systematically ablated without degrading the base model's reasoning capabilities. Native tool-calling support remains intact, making the model a viable uncensored candidate for autonomous local agents.
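For readers unfamiliar with the underlying mechanics, "abliteration" generally means identifying a refusal direction in activation space (the mean difference between activations on refused and accepted prompts) and projecting that direction out of the model's weight matrices. The sketch below is a minimal toy illustration of that directional-ablation idea with random data; it is not HauhauCS's actual pipeline, and all names and shapes are illustrative assumptions.

```python
import numpy as np

def refusal_direction(refused_acts, accepted_acts):
    # Unit-normalized mean-difference direction between two activation sets.
    d = refused_acts.mean(axis=0) - accepted_acts.mean(axis=0)
    return d / np.linalg.norm(d)

def ablate_direction(W, d):
    # Project the direction d out of the output side of W:
    # W' = W - d d^T W, so W' @ x has zero component along d for any x.
    return W - np.outer(d, d) @ W

rng = np.random.default_rng(0)
hidden = 64
refused = rng.normal(1.0, 0.1, (32, hidden))   # toy "refusal" activations
accepted = rng.normal(0.0, 0.1, (32, hidden))  # toy "compliant" activations
d = refusal_direction(refused, accepted)

W = rng.normal(size=(hidden, hidden))          # toy weight matrix
W_abl = ablate_direction(W, d)

x = rng.normal(size=hidden)
print(abs(d @ (W_abl @ x)))  # ~0: ablated output carries no refusal component
```

The GenRM-targeted variant described above presumably applies this kind of ablation to the reward-model layers rather than (or in addition to) the attention/MLP projections, which is what distinguishes it from earlier abliterations.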
DISCOVERED: 2026-03-25 (18d ago)
PUBLISHED: 2026-03-25 (18d ago)
RELEVANCE:
AUTHOR: hauhau901