OPEN_SOURCE
REDDIT // 10d ago · BENCHMARK RESULT
Qwen3 0.6B quant benchmarks point to Q5 as optimal
A developer published community benchmarks for the Qwen3 0.6B model across Q2 to Q8 quantization levels, testing math, instruction following, and knowledge retrieval. The results show a severe drop in reasoning ability under extreme compression, while factual-knowledge scores remain remarkably flat.
// ANALYSIS
These community tests provide a rare empirical look at how aggressive quantization degrades sub-1B parameter models, confirming Q5 as the ideal balance.
- GSM8K math performance collapses from 44.9% at Q8 to just 3.1% at Q2, showing that logic is highly sensitive to compression
- IFEval instruction following degrades from 18.7% to 12.9%, so formatting constraints are somewhat more resilient than pure reasoning
- MMLU knowledge scores remained completely flat at 22.9% across all quants, suggesting the model's factual-recall floor is reached immediately (22.9% is close to the 25% random-guess baseline for four-option questions)
- The roughly 10 tok/s speed gap between Q2 and Q8 is noticeable but likely not worth the massive intelligence tax
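The pattern above is consistent with basic quantization arithmetic: fewer bits means coarser weight grids and larger rounding error. The sketch below is a hypothetical illustration, not the actual GGUF k-quant scheme (which uses block-wise scales); it applies simple symmetric uniform quantization to random Gaussian weights and shows how RMS error grows sharply as the bit width drops toward 2.

```python
import math
import random

def quantize(weights, bits):
    """Symmetric uniform quantization: snap each weight to the nearest
    of 2**bits evenly spaced levels spanning [-max_abs, +max_abs]."""
    levels = 2 ** bits
    max_abs = max(abs(w) for w in weights)
    step = 2 * max_abs / (levels - 1)  # spacing between adjacent levels
    return [round(w / step) * step for w in weights]

def rms_error(weights, bits):
    """Root-mean-square error introduced by quantizing to `bits` bits."""
    q = quantize(weights, bits)
    return math.sqrt(sum((w - x) ** 2 for w, x in zip(weights, q)) / len(weights))

random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(10_000)]  # toy weight tensor

errors = {b: rms_error(weights, b) for b in (2, 4, 5, 8)}
for b, e in errors.items():
    print(f"{b}-bit RMS error: {e:.4f}")
```

Under this toy model, error roughly halves with each extra bit, so Q2 pays a steep accuracy cost for a modest size/speed win, matching the benchmark's conclusion that mid-range quants like Q5 hit the sweet spot.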
// TAGS
qwen3 · benchmark · llm · inference · open-weights
DISCOVERED
2026-04-02
PUBLISHED
2026-04-02
RELEVANCE
7/10
AUTHOR
PraxisOG