Adaptive-Quantization brings SNR-based, per-tensor llama.cpp quantization
OPEN_SOURCE ↗
REDDIT · 27d ago · OPEN-SOURCE RELEASE

A new open-source toolkit replaces llama.cpp's opaque Q4_K_M-style labels with filenames like `Qwen3.5-9B_12.6GB_45dB` that expose actual file size and signal-to-noise ratio. Scripts survey every tensor at every quantization level to produce mixed-precision GGUF files optimized for either a quality floor or a VRAM budget.
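The quality metric in those filenames is standard signal-to-noise ratio in decibels. As a minimal sketch of what a per-tensor survey measures, the snippet below quantizes a weight matrix at several bit widths with a plain uniform quantizer (an illustrative stand-in, not llama.cpp's k-quant formats) and reports the resulting SNR:

```python
import numpy as np

def snr_db(original: np.ndarray, quantized: np.ndarray) -> float:
    """Signal-to-noise ratio in decibels: 10 * log10(P_signal / P_noise)."""
    noise = original - quantized
    return 10.0 * np.log10(np.sum(original ** 2) / np.sum(noise ** 2))

def fake_quantize(x: np.ndarray, bits: int) -> np.ndarray:
    """Illustrative symmetric uniform quantizer: scale to the integer
    grid, round, dequantize. Real GGUF quant types are block-wise and
    more elaborate, but the SNR measurement works the same way."""
    scale = np.max(np.abs(x)) / (2 ** (bits - 1) - 1)
    return np.round(x / scale) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)
for bits in (8, 6, 4, 3):
    print(f"{bits}-bit: {snr_db(w, fake_quantize(w, bits)):.1f} dB")
```

Each extra bit buys roughly 6 dB of SNR for a uniform quantizer, which is why a single dB figure in the filename is a compact, comparable quality signal.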

// ANALYSIS

The QX_Y naming scheme has always been a proxy for quality — this project cuts through the abstraction by measuring what actually matters.

  • Per-tensor SNR profiling can yield models 3–6% smaller than uniform quantization at equivalent quality, a meaningful win for local inference
  • Filename convention `ModelName_SizeGB_SNRdB` gives practitioners instant, comparable quality signals without running benchmarks
  • Target-driven quantization (`--db 30` for a quality floor or `--size 8G` for a VRAM budget) flips the workflow: specify constraints, get the optimal plan
  • `verify_gguf.py` closes the loop with post-quantization validation so users know the achieved SNR, not just the planned one
  • Community traction is early (score 1 on r/LocalLLaMA), but the idea directly addresses a recurring pain point in the local-LLM space
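The target-driven workflow can be sketched as a selection problem over the survey data: given each tensor's (size, SNR) trade-off at every candidate quant type, pick the cheapest option that satisfies the constraint. The survey numbers and tensor names below are hypothetical, and the real toolkit's selection logic is not shown in the post; this only illustrates the `--db` quality-floor path (a `--size` budget would instead greedily upgrade tensors until the budget is spent):

```python
# Hypothetical per-tensor survey: quant type -> (size in MB, measured SNR in dB).
survey = {
    "blk.0.attn_q": {"Q4_K": (12.0, 38.0), "Q6_K": (18.0, 52.0), "Q8_0": (24.0, 64.0)},
    "blk.0.ffn_up": {"Q4_K": (30.0, 41.0), "Q6_K": (45.0, 55.0), "Q8_0": (60.0, 66.0)},
}

def plan_for_db_floor(survey: dict, floor_db: float) -> dict:
    """For each tensor, choose the smallest quant type whose measured
    SNR still meets the quality floor (the --db target)."""
    plan = {}
    for name, options in survey.items():
        viable = [(size, qtype) for qtype, (size, db) in options.items() if db >= floor_db]
        if not viable:
            raise ValueError(f"no quant type for {name} meets {floor_db} dB")
        plan[name] = min(viable)[1]  # smallest size among viable options
    return plan

print(plan_for_db_floor(survey, 50.0))
```

Because the choice is independent per tensor, the result is a mixed-precision plan: tensors that tolerate aggressive quantization stay small while sensitive ones keep more bits, which is where the 3–6% savings over uniform quantization comes from.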
// TAGS
adaptive-quantization · llm · inference · open-source · devtool · edge-ai

DISCOVERED

27d ago

2026-03-16

PUBLISHED

27d ago

2026-03-16

RELEVANCE

7 / 10

AUTHOR

bigattichouse