Abliterlitics punctures HauhauCS lossless claim
REDDIT // 5h ago // BENCHMARK RESULT

Abliterlitics is a forensic benchmark suite comparing three abliteration methods (HauhauCS, Heretic, and Huihui) across Qwen3 and Qwen3.5 models. The takeaway is blunt: HauhauCS removes refusals very effectively, but its “lossless” framing does not survive benchmark, KL-divergence, and weight-analysis scrutiny, especially as model size grows.

// ANALYSIS

This is less a product launch than a stress test of a niche model-editing claim, and it lands hard on the “lossless” marketing. The work is strongest where it stays reproducible and comparative: same base models, same safety suite, same KL methodology, same weight analysis.
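For readers unfamiliar with the KL comparison, the usual idea is to feed the same prompts to the base model and the edited model and measure how far the edited model's next-token distribution drifts from the base one. The sketch below illustrates that idea on toy logits using only the standard library. It is not taken from the Abliterlitics code: the function names (`log_softmax`, `kl_divergence`, `mean_kl`) and the input layout are assumptions made for illustration.

```python
import math


def log_softmax(logits):
    """Return log-probabilities for one vector of logits (numerically stable)."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]


def kl_divergence(base_logits, edited_logits):
    """KL(P_base || P_edited) in nats for a single next-token distribution."""
    log_p = log_softmax(base_logits)
    log_q = log_softmax(edited_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def mean_kl(base_batch, edited_batch):
    """Average the per-position KL over a batch of token positions.

    Assumption: both batches hold logits for the same prompts and positions,
    in the same order and over the same vocabulary.
    """
    if len(base_batch) != len(edited_batch):
        raise ValueError("batches must cover the same token positions")
    values = [kl_divergence(b, e) for b, e in zip(base_batch, edited_batch)]
    return sum(values) / len(values)
```

A value of zero means the edited model predicts exactly what the base model does on those prompts. A "lossless" edit would be expected to stay close to zero on harmless prompts, which is why a KL number is a natural check on that claim.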

  • The headline result is that HauhauCS is effective at uncensoring, but not lossless; TruthfulQA and other capability scores degrade more as models scale up.
  • Heretic looks like the most consistent method overall, while Huihui is highly architecture-dependent, ranging from competitive on small models to broken on Qwen3.5-4B and weak on the 27B.
  • The hybrid Mamba2+Transformer Qwen3.5 family behaves differently from the pure Transformer Qwen3-4B, which is a useful reminder that abliteration methods do not transfer cleanly across architectures.
  • The 27B result is the most important practical signal: even a heavily safety-aligned base model can still be stripped of refusals, but broad tensor edits appear to impose the largest collateral damage.
  • The strongest value is not the uncensoring itself but the methodology and artifact trail: benchmark cards, KL numbers, tensor overlap, and per-layer breakdowns make the claims checkable.
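The weight-analysis side of the points above can be sketched in a similar way. One plausible reading of "tensor overlap" is: find which tensors each method changed relative to the base checkpoint, then compare those sets. The code below is a minimal sketch under that assumption, not the suite's actual implementation. Tensors are represented as flat lists of floats, and the function names, the tolerance `tol`, and the tensor names used in the test are all hypothetical.

```python
import math


def relative_change(base, edited):
    """Frobenius norm of (edited - base) divided by the Frobenius norm of base."""
    delta = math.sqrt(sum((e - b) ** 2 for b, e in zip(base, edited)))
    scale = math.sqrt(sum(b * b for b in base))
    if scale == 0.0:
        return 0.0 if delta == 0.0 else math.inf
    return delta / scale


def modified_tensors(base_weights, edited_weights, tol=1e-6):
    """Names of tensors whose relative change exceeds tol (tol is an assumed cutoff)."""
    return {
        name
        for name, base in base_weights.items()
        if relative_change(base, edited_weights[name]) > tol
    }


def tensor_overlap(touched_a, touched_b):
    """Jaccard overlap between two sets of modified tensor names.

    Returns 0.0 when neither method touched anything (a convention chosen here).
    """
    union = touched_a | touched_b
    return len(touched_a & touched_b) / len(union) if union else 0.0
```

Grouping `relative_change` by layer index gives a per-layer breakdown, and the size of `modified_tensors` is a rough proxy for how broad an edit is, which is the property the 27B point above ties to collateral damage.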

// TAGS

abliterlitics · benchmark · safety · research · llm · qwen

DISCOVERED: 5h ago (2026-04-18)
PUBLISHED: 8h ago (2026-04-18)
RELEVANCE: 9/10
AUTHOR: nathandreamfast