Qwen3 0.6B quant benchmarks prove Q5 optimal
OPEN_SOURCE
REDDIT · 10d ago · BENCHMARK RESULT


A developer published community benchmarks for the Qwen3 0.6B model across Q2 to Q8 quantization levels, testing math, instruction following, and knowledge retrieval. The results show a severe drop in reasoning capability at extreme compression, while factual-knowledge scores remain flat across the entire range.

// ANALYSIS

These community tests provide a rare empirical look at how aggressive quantization degrades sub-1B parameter models, with the poster identifying Q5 as the best balance of size and capability.

  • GSM8K math performance collapses from 44.9% at Q8 to just 3.1% at Q2, proving logic is highly sensitive to compression
  • IFEval instruction following degrades from 18.7% to 12.9%, showing formatting constraints are slightly more resilient than pure reasoning
  • MMLU knowledge scores remained completely flat at 22.9% across all quants, suggesting the model's factual recall floor is reached immediately
  • The ~10 tok/s speed advantage of Q2 over Q8 is measurable but likely not worth the massive intelligence tax
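The degradation pattern in the bullets above can be sketched with a few lines of arithmetic. Only the Q8 and Q2 endpoint scores come from the post; the `retention` helper and the dictionary layout are illustrative, not part of the original benchmark code:

```python
# Endpoint accuracies (percent) quoted in the post, per benchmark.
scores = {
    "GSM8K":  {"Q8": 44.9, "Q2": 3.1},
    "IFEval": {"Q8": 18.7, "Q2": 12.9},
    "MMLU":   {"Q8": 22.9, "Q2": 22.9},
}

def retention(bench: str) -> float:
    """Fraction of the Q8 score retained at Q2 for a given benchmark."""
    s = scores[bench]
    return s["Q2"] / s["Q8"]

for name in scores:
    print(f"{name}: {retention(name):.0%} of Q8 accuracy retained at Q2")
```

Running this makes the spread obvious: math retains only a few percent of its Q8 score, instruction following most of it, and knowledge retention is unchanged at 100% because the model never rises above its MMLU floor.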
// TAGS
qwen3 · benchmark · llm · inference · open-weights

DISCOVERED

10d ago

2026-04-02

PUBLISHED

10d ago

2026-04-02

RELEVANCE

7/10

AUTHOR

PraxisOG