llama-swap stumbles on MLX model names
This Reddit post asks whether llama-swap can reliably front `mlx_lm.server` on an M2 Max: the server works when run directly, but the proxied setup stalls after loading, with an error that treats the alias `qwen35-27b-mlx` as a Hugging Face repo id.
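A plausible reading of the symptom: llama-swap forwards the client's `model` field upstream, and if `mlx_lm.server` treats an unknown model name as something to load, it will attempt a local-path or Hugging Face lookup on the alias. A minimal sketch of the two paths, with the repo id, ports, and prompt as placeholder assumptions rather than the poster's actual values:

```sh
# Direct invocation that reportedly works; the repo id is a placeholder
# for whatever MLX conversion is actually in use.
mlx_lm.server --model mlx-community/some-qwen-mlx-4bit --port 8081

# Through llama-swap, the request body carries the alias. If mlx_lm.server
# tries to resolve "qwen35-27b-mlx" as a path or Hugging Face repo id,
# the reported error is exactly what you'd expect.
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen35-27b-mlx", "messages": [{"role": "user", "content": "hello"}]}'
```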
The hot take is that this looks more like a config mismatch than proof that llama-swap cannot work with MLX.
- The direct `mlx_lm.server` path already works, so the backend and `/v1/chat/completions` compatibility are not the problem.
- The failure is a Hugging Face repo resolution error for `qwen35-27b-mlx`, which suggests the proxy is passing the alias where MLX expects a real model path or repo id.
- llama-swap is designed around wrapping OpenAI-compatible servers, so MLX should work in principle, but only if the upstream command and model identifier are wired the way MLX expects (see the config sketch after this list).
- As written, this is a good troubleshooting report for Apple Silicon users, but it does not yet establish a working MLX + llama-swap setup.
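If that diagnosis holds, the fix is to keep the alias local to llama-swap and hand `mlx_lm.server` an identifier it can actually resolve. A minimal, untested sketch using llama-swap's YAML layout (`models`, `cmd`, `proxy`, and the `${PORT}` macro); the repo id is a placeholder, and `useModelName`, which llama-swap documents as a way to override the model name sent upstream, is the load-bearing line if the installed version supports it:

```yaml
# llama-swap config sketch; ids and paths are placeholders, not the poster's values
models:
  "qwen35-27b-mlx":                 # the alias clients put in the "model" field
    cmd: >
      mlx_lm.server
      --model mlx-community/some-qwen-mlx-4bit
      --port ${PORT}
    proxy: "http://127.0.0.1:${PORT}"
    # Rewrite the model name forwarded upstream so mlx_lm.server never sees
    # the bare alias; use the same repo id or a local path it can resolve.
    useModelName: "mlx-community/some-qwen-mlx-4bit"
```

If `useModelName` is not available in the installed version, naming the llama-swap entry after the real repo id (so the forwarded value is already resolvable) should have the same effect.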
DISCOVERED: 2h ago (2026-05-09)
PUBLISHED: 6h ago (2026-05-09)
AUTHOR: No_Algae1753