OPEN_SOURCE ↗
REDDIT // 29d ago // BENCHMARK RESULT
M5 Max MacBook Pro nears RTX laptop LLMs
Reddit’s LocalLLaMA community is dissecting Hardware Canucks’ first M5 Max laptop tests, which put Apple’s new MacBook Pro roughly alongside RTX 5080 and 5090 laptops on a small LM Studio DeepSeek R1 14B run while using far less power. The bigger caveat is that the benchmark set is still thin and skips the prompt-processing and larger-model numbers that matter most for serious local inference.
// ANALYSIS
The M5 Max looks like a real contender for on-device LLM work, but these numbers are still more teaser than verdict. Apple's real advantage is not winning tiny-model token races; it is keeping bigger models usable once mobile GPU VRAM runs out.
- The clearest early datapoint is about 59 tok/s on DeepSeek R1 14B in LM Studio, close to laptop RTX 5080 and 5090 results in the same roundup.
- Reddit commenters focused on the M5 Max's 614GB/s unified memory bandwidth, arguing that Apple closes the gap when models no longer fit cleanly inside 24GB to 32GB mobile GPU memory.
- The missing data is prompt processing, long context, and larger 30B+ or MoE workloads, which are usually more revealing than a single small-model decode test.
- For AI developers, the interesting angle is efficiency and memory headroom: a quieter, battery-friendly laptop that can run larger local models may matter more than topping raw tokens-per-second charts.
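The bandwidth argument above can be made concrete with a standard back-of-envelope estimate: at batch size 1, decode is roughly memory-bandwidth bound, so the tokens-per-second ceiling is bandwidth divided by the bytes of weights read per token. A minimal sketch, assuming 4-bit quantization for the 14B model (the benchmark's actual quantization is not stated):

```python
# Back-of-envelope decode ceiling for a memory-bandwidth-bound LLM.
# Assumptions (not from the benchmark): 4-bit weights, batch size 1,
# and every weight byte read once per generated token.

def decode_ceiling_tok_s(params_b: float, bits_per_weight: float,
                         bandwidth_gb_s: float) -> float:
    """Upper bound on tokens/sec when decode is limited by memory bandwidth."""
    model_gb = params_b * bits_per_weight / 8  # weight footprint in GB
    return bandwidth_gb_s / model_gb

# M5 Max: 614 GB/s unified memory bandwidth (figure from the thread).
ceiling = decode_ceiling_tok_s(params_b=14, bits_per_weight=4, bandwidth_gb_s=614)
print(f"{ceiling:.0f} tok/s ceiling")  # ~88 tok/s; the reported ~59 tok/s sits below it
```

The same arithmetic explains the commenters' point: a 70B-class model at 4-bit needs roughly 35GB of weights, which overflows 24GB to 32GB of mobile GPU VRAM but fits comfortably in unified memory, where bandwidth rather than capacity becomes the limit.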
// TAGS
macbook-pro · llm · benchmark · inference · gpu
DISCOVERED
29d ago
2026-03-14
PUBLISHED
33d ago
2026-03-09
RELEVANCE
7 / 10
AUTHOR
themixtergames