GPT-OSS 120B tops 60 tok/sec on M5 Max
OPEN_SOURCE
REDDIT · 8d ago · BENCHMARK RESULT


OpenAI's 117B-parameter MoE model sustains over 60 tokens/sec, well beyond human reading speed, on the MacBook Pro M5 Max, leveraging 128GB of unified memory and Apple's MLX framework. A notable milestone for local inference of strong reasoning models on portable hardware.

// ANALYSIS

The arrival of workstation-class performance on a laptop sharply reduces cloud dependency for privacy-sensitive professional workflows.

  • The MoE architecture activates only 5.1B of the 117B parameters per token, letting the 120B model reach throughput typical of much smaller dense models
  • The M5 Max's 614 GB/s memory bandwidth is the critical enabler, roughly doubling prior-generation performance for bandwidth-bound local inference
  • MXFP4 quantization (about 4.25 bits per weight) preserves model quality while fitting the weights within roughly 70GB, leaving room for 128k-token context windows on 128GB machines
  • Apache 2.0 licensing combined with fully local hardware offers a viable alternative to proprietary APIs for HIPAA-sensitive clinical and legal document processing
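The bandwidth and active-parameter figures above can be tied together with a back-of-envelope check: autoregressive decoding is memory-bandwidth bound, so the throughput ceiling is roughly bandwidth divided by bytes read per token. A minimal sketch, using the article's numbers (the efficiency comparison against the observed 60 tok/s is an illustration, not a measured breakdown):

```python
# Decode-phase ceiling estimate: tokens/sec <= bandwidth / bytes-per-token.
# All input figures come from the article; MXFP4's ~4.25 bits/weight
# accounts for 4-bit values plus shared per-block scale factors.
BANDWIDTH_BYTES_PER_S = 614e9   # M5 Max unified memory bandwidth
ACTIVE_PARAMS = 5.1e9           # MoE parameters activated per token
TOTAL_PARAMS = 117e9            # full model size
BITS_PER_WEIGHT = 4.25          # MXFP4 effective bits per weight

bytes_per_token = ACTIVE_PARAMS * BITS_PER_WEIGHT / 8
ceiling_tps = BANDWIDTH_BYTES_PER_S / bytes_per_token
weights_gb = TOTAL_PARAMS * BITS_PER_WEIGHT / 8 / 1e9

print(f"theoretical decode ceiling: {ceiling_tps:.0f} tok/s")
print(f"observed 60 tok/s is {60 / ceiling_tps:.0%} of that ceiling")
print(f"quantized weights footprint: {weights_gb:.1f} GB")
```

The ceiling lands in the low hundreds of tokens/sec, so the reported 60 tok/s is a plausible fraction of it once attention, KV-cache reads, and runtime overhead are accounted for; the ~62 GB weight footprint is likewise consistent with the article's ~70GB figure after runtime buffers.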
// TAGS
gpt-oss-120b · mlx · llm · inference · open-source · apple-silicon · edge-ai

DISCOVERED

2026-04-04 (8d ago)

PUBLISHED

2026-04-03 (8d ago)

RELEVANCE

9/10

AUTHOR

Plus-Conclusion-3169