Heretic 1.3 adds reproducibility, benchmarks

// 45d ago · OPENSOURCE RELEASE

Heretic 1.3 adds reproducible run artifacts, built-in benchmarking, lower peak VRAM usage, and broader model support. The release turns the project into a more auditable workflow for decensoring and evaluating models without leaving the app.

// ANALYSIS

Heretic is evolving from a clever model-editing tool into a more serious research pipeline: you can now track what produced a result, measure whether it damaged the base model, and fit larger architectures with less memory overhead.

– The new `reproduce` directory is the biggest upgrade: it captures the environment details needed for byte-for-byte reruns, which matters because the output of GPU-dependent tensor ops can vary with hardware and software versions.
– Built-in lm-eval-style benchmarking removes friction when deciding whether a trial is publishable or needs another iteration.
– The peak VRAM reductions are practical, not cosmetic: they let more users run larger models on the same hardware.
– Broader layer/module handling keeps the project relevant as model families keep changing underneath it.
– The optional upload flow for reproducibility data is a good trust signal because it keeps publishing under user control.
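
To make the reproducibility point concrete, here is a minimal sketch of what capturing environment details for a rerun can look like. It is not Heretic's implementation: the function names, the manifest fields, and the package list are all assumptions chosen for illustration, and the actual contents of the `reproduce` directory may differ.

```python
import hashlib
import json
import platform
import sys
from importlib import metadata


def collect_manifest(packages, seed):
    """Gather environment details that commonly affect rerun results.

    Hypothetical helper: the field names are illustrative, not Heretic's format.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
        "packages": versions,
        "seed": seed,
    }


def manifest_digest(manifest):
    """Hash a canonical JSON encoding so two manifests can be compared quickly."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Example package list and seed are assumptions for the sketch.
manifest = collect_manifest(["torch", "transformers"], seed=1234)
print(json.dumps(manifest, indent=2))
print(manifest_digest(manifest))
```

Comparing digests from two machines is a quick way to spot an environment mismatch before blaming a result difference on the model edit itself. A matching digest does not guarantee identical outputs, since this sketch records no GPU model, driver, or kernel-level settings.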

// TAGS

heretic · llm · open-source · evaluation · benchmark · gpu · cli

DISCOVERED

45d ago

2026-05-05

PUBLISHED

45d ago

2026-05-05

RELEVANCE

8/10

AUTHOR

-p-e-w-
