GPT-OSS 120B flies on M5 Max

// 83d ago · BENCHMARK RESULT

A Reddit benchmark on an M5 Max 128GB compares three ~120B models: Nemotron-3 Super, GPT-OSS 120B, and Qwen3.5 122B. GPT-OSS lands behind Nemotron on quality but comes in far ahead on speed, at roughly 77 tokens/sec versus about 35 for the others.

// ANALYSIS

This is a strong reminder that local-LLM performance is no longer just about raw parameter count. On Apple silicon, model architecture and quantization can matter as much as size, and that changes which models feel practical day to day.

  • Nemotron-3 Super looks like the best pick if you care most about answer quality in this specific test.
  • GPT-OSS 120B is the most interesting result because its throughput is high enough to make a 120B model feel interactive.
  • Qwen3.5 122B trailing both suggests “bigger” does not automatically mean “better” once you factor in runtime efficiency.
  • The result is still anecdotal, so it is useful as a real-world signal, not a universal ranking.
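
To put the throughput gap in concrete terms, here is a minimal sketch that converts the quoted tokens/sec figures into wall-clock time for one reply. The 500-token response length and the helper names are assumptions made for illustration, not part of the benchmark, and the estimate ignores prompt processing time.

```python
# Approximate decode rates quoted in the summary (tokens per second).
TOKENS_PER_SEC = {
    "GPT-OSS 120B": 77.0,
    "Nemotron-3 Super": 35.0,
    "Qwen3.5 122B": 35.0,
}

# Hypothetical reply length, chosen only for illustration.
RESPONSE_TOKENS = 500


def generation_seconds(tokens: int, tokens_per_sec: float) -> float:
    """Seconds to emit `tokens` at a steady decode rate (prompt processing ignored)."""
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return tokens / tokens_per_sec


def speedup(fast: float, slow: float) -> float:
    """Ratio of two decode rates."""
    if slow <= 0:
        raise ValueError("slow rate must be positive")
    return fast / slow


if __name__ == "__main__":
    for model, rate in TOKENS_PER_SEC.items():
        seconds = generation_seconds(RESPONSE_TOKENS, rate)
        print(f"{model}: {seconds:.1f} s for {RESPONSE_TOKENS} tokens")
    ratio = speedup(TOKENS_PER_SEC["GPT-OSS 120B"], TOKENS_PER_SEC["Nemotron-3 Super"])
    print(f"GPT-OSS 120B vs. the others: {ratio:.1f}x")
```

Under these assumptions, a 500-token reply takes about 6.5 seconds at 77 tokens/sec versus about 14.3 seconds at 35 tokens/sec, a ratio of roughly 2.2x.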
// TAGS
gpt-oss-120b, llm, benchmark, open-weights, inference, self-hosted

DISCOVERED

83d ago

2026-03-19

PUBLISHED

83d ago

2026-03-18

RELEVANCE

8/10

AUTHOR

albertgao