Mac mini 32GB Hits 34 tok/s on gpt-oss-20b

// 71d ago BENCHMARK RESULT

A Reddit user shared a concrete local inference benchmark for LM Studio on a Mac mini with 32GB of memory, running Unsloth’s gpt-oss-20b-Q4_K_S.gguf at a 26,035-token context. With OpenClaw 2026.3.8, LM Studio 0.4.6+1, and mostly default inference settings, the setup reportedly reached 34 tok/s and about 0.7 seconds to first token after the first prompt.

// ANALYSIS

Real-world local LLM benchmarks like this are useful because they show what a 20B open-weight model feels like on mainstream Apple hardware, not just in polished demos.

– The setup is specific enough to be actionable: Mac mini 32GB, LM Studio 0.4.6+1, Q4_K_S quantization, 26k context, and mostly default runtime settings.
– 34 tok/s with sub-second TTFT is a strong practical result for local chat, especially at that context length.
– This is still a single-user datapoint, so it should be read as directional rather than a controlled benchmark suite.
– The bigger takeaway is that this class of open-weight model is now comfortably usable on a 32GB desktop.
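
To make the reported numbers concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a simple model that is not part of the original post: total reply time is the time to first token plus output length divided by a constant decode rate. The function name and the default values (34 tok/s, 0.7 s TTFT, taken from the reported figures) are illustrative only, and real throughput will vary with prompt length, context fill, and hardware state.

```python
def estimated_response_seconds(
    output_tokens: int,
    tokens_per_second: float = 34.0,  # reported decode rate (single datapoint)
    ttft_seconds: float = 0.7,        # reported time to first token
) -> float:
    """Rough wall-clock estimate for one reply.

    Assumes a constant decode rate after the first token, which is a
    simplification: actual speed can change as the context fills up.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return ttft_seconds + output_tokens / tokens_per_second


if __name__ == "__main__":
    # Print estimates for a short, medium, and long reply.
    for n in (100, 500, 1000):
        print(f"{n} tokens: ~{estimated_response_seconds(n):.1f} s")
```

Under these assumptions, a 100-token reply takes roughly 3.6 s, a 500-token reply about 15.4 s, and a 1,000-token reply about 30.1 s, which is consistent with the "usable for local chat" reading above.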

// TAGS

lm-studio, gpt-oss-20b, local-llm, mac-mini, benchmark, apple-silicon, inference, quantization

DISCOVERED

71d ago

2026-03-18

PUBLISHED

71d ago

2026-03-18

RELEVANCE

8/10

AUTHOR

groover75
