OPEN-SOURCE RELEASE
REDDIT · 3h ago
MLXr turns Macs into OpenAI endpoints
MLXr is a new open-source local inference server and dashboard for Apple Silicon that wraps Apple's MLX stack in an OpenAI-compatible `/v1` API. It aims to remove the usual setup friction around local models by bundling model search, downloads, per-model settings, monitoring, and tool-calling support into one browser-based app.
// ANALYSIS
MLXr is less about novel model tech and more about packaging local MLX inference into something developers can actually live with day to day. That makes it a pragmatic bet on Apple Silicon becoming a serious agent-hosting platform, not just a hobbyist playground.
- The strongest hook is compatibility: existing OpenAI SDK clients can be pointed at `http://localhost:8000/v1` instead of being rewritten for a new stack.
- The built-in Hugging Face browser, auto context-window detection, and persisted model settings target the annoying operational gaps that usually keep local setups feeling brittle.
- Tool-calling normalization across Qwen, Llama, DeepSeek, Mistral, Phi, and Hermes matters more than it sounds; agent workflows tend to break on exactly this kind of model-specific formatting mismatch.
- The project is very early, with a tiny GitHub footprint so far, which means the concept is promising but production maturity is still unproven.
- For developers already on Apple Silicon, MLXr sits in the same workflow lane as Ollama and llama.cpp servers, but with a sharper focus on MLX-native UX and OpenAI-client drop-in use.
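The drop-in compatibility in the first bullet can be sketched with nothing but the standard library: an OpenAI-style `chat/completions` POST against the local endpoint. This is a minimal sketch, assuming MLXr serves the standard route at the `http://localhost:8000/v1` address from the post; the model id and the `chat()` helper are placeholders, not part of MLXr itself.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumed local MLXr endpoint


def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completion payload (single user turn)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(model: str, prompt: str) -> str:
    """POST the payload to the local server and return the reply text."""
    payload = build_chat_request(model, prompt)
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer not-needed",  # local servers typically ignore the key
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Equivalently, the official OpenAI Python SDK can be reused unchanged by constructing the client with `base_url="http://localhost:8000/v1"`, which is the drop-in path the post emphasizes.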
// TAGS
mlxr · open-source · inference · api · sdk · self-hosted · agent · ai-coding
DISCOVERED
3h ago
2026-04-23
PUBLISHED
5h ago
2026-04-23
RELEVANCE
8/10
AUTHOR
Squirrel_Glad