oQ debuts mixed-precision quantization for Apple Silicon

// 65d ago · OPENSOURCE RELEASE

oQ is a data-driven mixed-precision quantizer for Apple Silicon that uses calibration data to assign a bit width to each layer instead of forcing one uniform width across the whole model. It emits standard mlx-lm-compatible models, so the same quantized weights can move across oMLX, mlx-lm, LM Studio, and other loaders that read MLX safetensors, with no custom format required.
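
To make the "bits per layer" idea concrete, here is a minimal sketch of budgeted bit allocation. It is not oQ's actual algorithm, which this summary does not describe. The `Layer` type, the `sensitivity` score, the `allocate_bits` function, the candidate bit widths, and the rule that error roughly halves with each extra bit are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    num_params: int
    sensitivity: float  # hypothetical calibration score: higher means low precision hurts more


def allocate_bits(layers, avg_bits_budget, choices=(2, 3, 4, 6, 8)):
    """Greedy sketch: start every layer at the lowest width, then keep
    upgrading the layer with the largest remaining score while the
    total bit budget allows it."""
    choices = sorted(choices)
    total_params = sum(layer.num_params for layer in layers)
    budget = avg_bits_budget * total_params  # total weight bits we may spend
    level = {layer.name: 0 for layer in layers}  # index into choices per layer
    used = choices[0] * total_params
    if used > budget:
        raise ValueError("budget is below the smallest available bit width")

    while True:
        best, best_score, best_extra = None, 0.0, 0
        for layer in layers:
            i = level[layer.name]
            if i + 1 == len(choices):
                continue  # already at the widest option
            extra = (choices[i + 1] - choices[i]) * layer.num_params
            if used + extra > budget:
                continue  # this upgrade would not fit in the budget
            # Assumption: remaining error shrinks by about half per bit.
            score = layer.sensitivity / 2 ** choices[i]
            if best is None or score > best_score:
                best, best_score, best_extra = layer, score, extra
        if best is None:
            break  # no affordable upgrade is left
        level[best.name] += 1
        used += best_extra

    return {name: choices[i] for name, i in level.items()}


# Toy input with made-up sizes and scores.
layers = [
    Layer("embed_tokens", 100, 5.0),
    Layer("self_attn", 100, 1.0),
    Layer("mlp", 200, 0.1),
]
bits = allocate_bits(layers, avg_bits_budget=4.0)
```

With a 4-bit average budget, the sensitive layers end up wider than the insensitive ones while the model-wide average stays at or below the target, which is the trade a uniform quantizer cannot make.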

// ANALYSIS

This is the right instinct for local LLMs: treat precision as a budget to allocate, not a fixed rule to apply everywhere. If oQ keeps the artifact portable, it solves both quality and UX at once.

  • The Qwen3.5-35B-A3B table is the headline: oQ's 2-bit and 3-bit runs beat uniform mlx-lm by a wide margin on MMLU and TruthfulQA, which suggests the sensitivity heuristic is doing real work.
  • The built-in 600-sample calibration set is a practical adoption win because users don't need to assemble their own calibration corpus before trying it.
  • The interoperability story is the real moat: once the model stays MLX-standard, users can quantize once and run anywhere in the Apple Silicon stack.
  • The 4-bit HumanEval dip versus mlx-lm is a healthy caution flag; mixed precision looks promising, but it still needs broader validation across architectures and evals.
// TAGS
oq, omlx, open-source, inference, edge-ai, llm, mlops

DISCOVERED: 2026-03-23 (65d ago)

PUBLISHED: 2026-03-23 (65d ago)

RELEVANCE: 8/10

AUTHOR: cryingneko