Qwen3.6 mini GGUFs trim MTP grafting downloads
This project strips Qwen3.6 GGUFs down to just the MTP (multi-token prediction) tensors needed by buzz’s grafting script. The result is two small donor files, about 900MB for the 35A3B variant and 451MB for the 27B variant, in place of the full 38GB and 29GB downloads respectively.
Useful niche plumbing: it does not make inference easier by itself, but it removes the most annoying part of the MTP grafting workflow for people already managing local model libraries.
- The payoff is bandwidth and setup time, not new capability; these are compatibility shims for an existing conversion script.
- The author claims SHA256 parity against outputs made from the full models, which is the right validation for this kind of utility.
- Scope is narrow: only the two tested Qwen3.6 variants are covered, and the post itself warns that other model variants may fail.
- The artifact depends on an unstable MTP ecosystem, so treat it as a convenience layer rather than something to archive as canonical.
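The SHA256 parity check mentioned above is easy to reproduce locally. The following is a minimal sketch using only Python's standard library, not the author's actual tooling; the function names `sha256_of` and `outputs_match` are hypothetical, and it assumes you have two grafted output files on disk, one built from the full model and one built from the mini donor.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in 1 MiB chunks
    so multi-gigabyte GGUFs never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns empty bytes (EOF).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def outputs_match(path_a, path_b):
    """True when both files have identical SHA256 digests.
    Hypothetical helper: the paths are whatever grafted outputs you produced."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Calling `outputs_match(full_output_path, mini_output_path)` with your own two paths returns `True` only when the files hash identically, which is the property the post claims for the two tested variants.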
DISCOVERED: 2026-05-08
PUBLISHED: 2026-05-07
AUTHOR: AzerbaijanNyan