MiniMax M2.7 hits GGUF, runs on Apple Silicon
The 229B Mixture-of-Experts (MoE) coding model receives its first GGUF quants, enabling local inference on high-end hardware. Apple Silicon users with 128GB unified memory can now run the Q3_K_L variant of this frontier-level reasoning model.
MiniMax M2.7 is a self-evolving MoE powerhouse that matches GPT-5 and Claude 4.6 on coding benchmarks while staying efficient through its 10B-active-parameter architecture. The Q3_K_L quant (~110GB) lets owners of 128GB M3 Max machines host a frontier-level reasoning model locally for the first time. Its interleaved thinking design emits <think> tags to work through complex logic, so local front-ends need explicit reasoning-tag support to render output cleanly. A 196k-token context window and 256 experts sustain high-fidelity performance across long-horizon agentic workflows. On SWE-bench it scores 78%, placing it ahead of Claude Opus 4.6 for software-engineering tasks. The catch: the modified MIT license restricts use to non-commercial research, a significant hurdle for enterprise local-first adoption.
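Whether the ~110GB quant actually loads on a 128GB machine depends on macOS's GPU wired-memory ceiling, which by default sits at roughly 75% of unified memory (it can be raised with `sysctl iogpu.wired_limit_mb`). A back-of-envelope sketch, using the article's figures; the 75% default, the 4GB headroom allowance, and the `fits_in_wired_memory` helper are illustrative assumptions, not measured values:

```python
def fits_in_wired_memory(weights_gb: float, unified_gb: float,
                         wired_fraction: float = 0.75,
                         headroom_gb: float = 4.0) -> bool:
    """Rough check: do the model weights, plus some KV-cache/activation
    headroom, fit under the GPU wired-memory ceiling?"""
    return weights_gb + headroom_gb <= unified_gb * wired_fraction

# Default ceiling: 128 GB * 0.75 = 96 GB, too small for ~110 GB of weights.
print(fits_in_wired_memory(110, 128))                        # False
# After raising the wired limit to ~93% of unified memory:
print(fits_in_wired_memory(110, 128, wired_fraction=0.93))   # True
```

The arithmetic suggests that out of the box the load would fail, and that 128GB users need to lift the wired limit before the Q3_K_L variant fits.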
DISCOVERED: 2026-04-12 (7h ago)
PUBLISHED: 2026-04-12 (8h ago)
AUTHOR: Remarkable_Jicama775