MiniMax M2.7 quant trims to 74 GB
JANGQ-AI’s MiniMax-M2.7-JANGTQ_K repackages MiniMax M2.7 into a mixed-bit MLX quant for local Apple Silicon use. The card puts the bundle at about 74 GB on disk, with more bits spent on sensitive down-projection weights and heavier compression elsewhere.
This is less a new model than a better packaging strategy for running a big MoE locally. The interesting part is the bit allocation: it tries to preserve quality where residual-stream noise hurts most while squeezing the rest hard enough to make the model more practical on high-RAM Macs.
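To make the bit allocation concrete, here is a minimal sketch of how such a mixed-bit recipe can be expressed with mlx-lm's `convert` and its `quant_predicate` callback. The 4-bit/2-bit split mirrors the card's description, but the source repo path, output path, and group sizes are assumptions, not the release's actual recipe.

```python
# Minimal sketch of a mixed-bit MLX quantization recipe, assuming mlx-lm's
# convert() with a quant_predicate callback. The 4-bit/2-bit split mirrors
# the card's description; paths and group sizes below are hypothetical.
from mlx_lm import convert

def mixed_bit_predicate(path, module, config):
    """Pick per-layer quantization settings by weight name."""
    if "down_proj" in path:
        # Spend more bits where residual-stream noise hurts most.
        return {"bits": 4, "group_size": 64}
    if "gate_proj" in path or "up_proj" in path:
        # Squeeze the expert gate/up projections hard.
        return {"bits": 2, "group_size": 64}
    # Everything else falls through to the default settings.
    return True

convert(
    hf_path="MiniMaxAI/MiniMax-M2.7",   # hypothetical source repo
    mlx_path="minimax-m2.7-mixed-bit",  # local output directory
    quantize=True,
    q_bits=4,            # default for layers the predicate passes through
    q_group_size=64,
    quant_predicate=mixed_bit_predicate,
)
```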
- The release targets a narrow but real audience: people who want frontier-ish MiniMax M2.7 behavior without hauling around the full FP8 source.
- The mixed-bit scheme is the point, not just the file size: it pairs 4-bit precision for `down_proj` with 2-bit compression for `gate_proj`/`up_proj`.
- The pre-stacked layout removes runtime restacking, which should make cold loads simpler and more predictable in MLX workflows (see the load sketch after this list).
- The non-commercial license and 74 GB footprint mean this is still a power-user artifact, not a broadly deployable everyday model.
- Compared with the slimmer 2-bit JANGTQ variant, this is the quality-first option in the same local-inference family.
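Once a bundle like this is on disk, running it is the standard mlx-lm load-and-generate path. A minimal usage sketch, assuming the Hugging Face repo id matches the card's name:

```python
# Minimal usage sketch, assuming the bundle loads through mlx-lm's standard
# load/generate path and that the repo id below matches the card's name.
from mlx_lm import load, generate

model, tokenizer = load("JANGQ-AI/MiniMax-M2.7-JANGTQ_K")

prompt = "Summarize the trade-offs of mixed-bit MoE quantization."
# Apply the model's chat template if one ships with the tokenizer.
if tokenizer.chat_template is not None:
    prompt = tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt}],
        add_generation_prompt=True,
        tokenize=False,
    )

text = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(text)
```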
Published 2026-05-07 · Discovered 2026-05-08 · Author: cafedude