oQ debuts mixed-precision quantization for Apple Silicon
REDDIT · 19d ago · OPEN-SOURCE RELEASE

oQ is a data-driven mixed-precision quantizer for Apple Silicon. It uses a calibration pass to assign a bit width to each layer instead of forcing one uniform width across the whole model. It emits standard mlx-lm-compatible models, so the same quantized weights can move across oMLX, mlx-lm, LM Studio, and other MLX safetensors loaders without a custom format.
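
To make the idea of per-layer bit assignment concrete, here is a minimal sketch of one way to spend a fixed average-bits budget across layers. It is not oQ's actual algorithm, which the post does not spell out; it is a generic greedy heuristic. The `Layer` type, the `loss_at_bits` sensitivity table, and `allocate_bits` are all hypothetical names introduced for illustration, and the sensitivity numbers are assumed to come from some calibration pass.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    n_params: int
    # Hypothetical calibration output: estimated quality loss when this
    # layer is quantized to each candidate bit width (lower is better).
    loss_at_bits: dict


def allocate_bits(layers, avg_bits_budget, choices=(2, 3, 4, 6, 8)):
    """Greedy sketch: start every layer at the lowest width, then keep
    upgrading the layer with the best loss reduction per extra bit spent,
    until no upgrade fits in the budget."""
    choices = sorted(choices)
    bits = {layer.name: choices[0] for layer in layers}
    total_params = sum(layer.n_params for layer in layers)
    budget = avg_bits_budget * total_params  # total weight bits allowed
    used = choices[0] * total_params

    while True:
        best = None  # (gain_per_bit, layer, next_width, extra_bits)
        for layer in layers:
            cur = bits[layer.name]
            idx = choices.index(cur)
            if idx + 1 == len(choices):
                continue  # already at the widest candidate
            nxt = choices[idx + 1]
            extra = (nxt - cur) * layer.n_params
            if used + extra > budget:
                continue  # this upgrade would exceed the budget
            gain = (layer.loss_at_bits[cur] - layer.loss_at_bits[nxt]) / extra
            if gain > 0 and (best is None or gain > best[0]):
                best = (gain, layer, nxt, extra)
        if best is None:
            return bits
        _, layer, nxt, extra = best
        bits[layer.name] = nxt
        used += extra
```

With a small sensitive layer and a large insensitive one, a sketch like this spends the budget on the sensitive layer first and leaves the large layer at the lowest width. It counts only weight bits and ignores per-group scale and bias overhead, which a real quantizer would need to account for.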

// ANALYSIS

This is the right instinct for local LLMs: treat precision as a budget to allocate, not a fixed rule to apply everywhere. If oQ keeps the artifact portable, it addresses both output quality and usability at once.

  • The Qwen3.5-35B-A3B table is the headline: oQ's 2-bit and 3-bit runs beat uniform mlx-lm by a wide margin on MMLU and TruthfulQA, which suggests the sensitivity heuristic is doing real work.
  • The built-in 600-sample calibration set is a practical adoption win because users don't need to assemble their own calibration corpus before trying it.
  • The interoperability story is the real moat: once the model stays MLX-standard, users can quantize once and run anywhere in the Apple Silicon stack.
  • The 4-bit HumanEval dip versus mlx-lm is a healthy caution flag; mixed precision looks promising, but it still needs broader validation across architectures and evals.
// TAGS
oq · omlx · open-source · inference · edge-ai · llm · mlops

DISCOVERED: 19d ago (2026-03-23)
PUBLISHED: 19d ago (2026-03-23)
RELEVANCE: 8/10
AUTHOR: cryingneko