Sarvam-30B gets uncensored abliteration fork
Days after Sarvam AI open-sourced Sarvam-30B, a community contributor published Sarvam-30B Uncensored on Hugging Face, claiming to remove the model’s refusal mechanisms with an “abliteration” weight-surgery method based on recent alignment research. It is a derivative release, not an official Sarvam update, and is aimed at open-model experimentation across reasoning, coding, and multilingual use cases.
This is the open-weights ecosystem moving at full speed: a fresh base model lands, and the community immediately starts remixing its alignment layer.
- The model card says it preserves Sarvam-30B’s architecture and capabilities while projecting out refusal directions across 19 layers and the lm_head, making this more than a simple jailbreak prompt pack
- The release is notable for Indian-language AI because the base Sarvam-30B was positioned as a strong 22-language reasoning model, so uncensored derivatives could quickly attract benchmarking and fine-tuning interest
- For developers, the interesting angle is research and evaluation: it is a concrete testbed for studying alignment, refusal circuits, and post-training safety tradeoffs in open models
- For production use, the warning is obvious: the model card explicitly says built-in safety filters are gone, so this is a lab artifact, not something to drop into user-facing apps without strong external guardrails
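The "projecting out refusal directions" step mentioned in the takeaways above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the fork's actual procedure: the model card's exact method is not reproduced here, the direction is estimated as a difference of mean activations (a common approach in refusal-direction research), and `refusal_direction`, `project_out`, and the toy random data are all hypothetical.

```python
import numpy as np

def refusal_direction(harmful_acts, harmless_acts):
    """Hypothetical helper: unit vector along the difference of mean activations."""
    diff = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)

def project_out(weight, direction):
    """Remove the component along `direction` from everything `weight` writes.

    weight:    (d_model, d_in) matrix whose output lands in the residual stream
    direction: (d_model,) unit vector
    """
    return weight - np.outer(direction, direction @ weight)

# Toy data standing in for real activations (assumption: no real model is loaded).
rng = np.random.default_rng(0)
d_model, d_in = 8, 4
harmful = rng.normal(size=(16, d_model)) + 2.0 * np.eye(d_model)[0]
harmless = rng.normal(size=(16, d_model))

r = refusal_direction(harmful, harmless)
W = rng.normal(size=(d_model, d_in))
W_abl = project_out(W, r)

# After the edit, no input can make this matrix write along r.
x = rng.normal(size=d_in)
residual = float(r @ (W_abl @ x))
```

Here `residual` should be zero up to floating-point error, since the edited matrix has no output component along `r`. For a matrix that reads from the residual stream instead, such as an lm_head of shape (vocab, d_model), the analogous edit would remove the direction on the input side: `weight - np.outer(weight @ direction, direction)`.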
DISCOVERED: 2026-03-10
PUBLISHED: 2026-03-10
AUTHOR: Available-Deer1723