OPEN_SOURCE
REDDIT · 13d ago · BENCHMARK RESULT
LM Studio M5 speeds look uneven
A user with a 32GB M5 MacBook Pro is sanity-checking LM Studio throughput after seeing 8 t/s on Gemma 3 27B 4-bit MLX, 32 t/s on Nemotron 3 Nano 4B GGUF, and 39 t/s on GPT OSS 20B MLX at default context settings. The thread asks for comparable numbers from other M5 MacBook Air/Pro machines to establish whether the Gemma slowdown is expected or a tuning issue.
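For anyone wanting to post comparable numbers, a rough way to probe throughput is to time a streamed completion against LM Studio's local OpenAI-compatible server. A minimal sketch, assuming the server is running at its default http://localhost:1234/v1 and that the model identifier is swapped for whatever is actually loaded (the name below is a placeholder, not taken from the thread):

```python
# Rough tokens/sec probe against LM Studio's local OpenAI-compatible server.
# Assumptions: default endpoint http://localhost:1234/v1, any API key accepted,
# and MODEL replaced with the identifier of the loaded model.
import time
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
MODEL = "gemma-3-27b-it-4bit"  # placeholder; check client.models.list() for real IDs

start = time.time()
chunks = 0
stream = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": "Summarize the history of the Mac in 300 words."}],
    max_tokens=512,
    stream=True,
)
for chunk in stream:
    # Each streamed chunk usually carries about one token of text,
    # so the chunk count is a rough proxy for generated tokens.
    if chunk.choices and chunk.choices[0].delta.content:
        chunks += 1

elapsed = time.time() - start
print(f"~{chunks / elapsed:.1f} tok/s ({chunks} chunks in {elapsed:.1f}s)")
```

The chunk count is only an approximation of token count, and the elapsed time includes prompt processing, so this will typically read a little below the generation speed LM Studio reports in its own UI.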
// ANALYSIS
Local AI speed on Apple Silicon is a moving target, and this reads less like a broken machine than a calibration check for LM Studio's newest runtime path.
- LM Studio officially supports both `llama.cpp` GGUF and Apple `MLX` on Apple Silicon, and Apple calls out LM Studio as one of the apps that should benefit from M5's Neural Accelerators.
- LM Studio has already shipped M5-specific MLX NAX auto-upgrade fixes, so benchmark comparisons on this chip age fast and older runtime builds can undershoot.
- Community M5 reports already show Nemotron Nano 4-bit at around ~55 t/s after a runtime switch, which makes the poster's 32 t/s believable but not especially fast.
- The 27B Gemma run is the outlier: 8 t/s points to a bandwidth-heavy workload, a heavier context, or a model/backend pairing that is not hitting the chip's best path (see the bandwidth sketch below).
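A back-of-envelope sketch of the bandwidth arithmetic behind that read, assuming roughly 153 GB/s of unified memory bandwidth for the base M5 and ~0.55 bytes per parameter for a 4-bit quant including scales and overhead (both assumed figures, not from the thread):

```python
# Decode ceiling for a bandwidth-bound dense model: each generated token has to
# stream roughly the full weight set from unified memory, so
# tok/s is capped near memory_bandwidth / weight_bytes.
# Assumed numbers (not from the thread): ~153 GB/s base-M5 bandwidth,
# ~0.55 bytes/param for a 4-bit quant with scales and overhead.
BANDWIDTH_GBPS = 153.0
BYTES_PER_PARAM = 0.55

def decode_ceiling(params_billion: float) -> float:
    """Upper bound on tokens/sec if decoding is purely memory-bandwidth-limited."""
    weight_gb = params_billion * BYTES_PER_PARAM
    return BANDWIDTH_GBPS / weight_gb

for name, params in [("Gemma 3 27B (dense)", 27.0), ("Nemotron Nano 4B (dense)", 4.0)]:
    print(f"{name}: ~{decode_ceiling(params):.0f} tok/s ceiling")
# Gemma 3 27B lands near ~10 tok/s, so the reported 8 tok/s is plausible but
# close to the wall; the 4B model's ~70 tok/s ceiling leaves clear headroom
# above the 32 tok/s the poster measured.
```

GPT OSS 20B is left out of the dense estimate because it is a mixture-of-experts model that only streams its active experts per token, which is why it can run well above what a whole-weight calculation would suggest.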
// TAGS
lm-studio · llm · inference · benchmark · gpu · self-hosted
DISCOVERED
2026-03-29 (13d ago)
PUBLISHED
2026-03-29 (14d ago)
RELEVANCE
8/10
AUTHOR
nemuro87