OpenMythos Qwen Shell runs transplanted Qwen2.5-1.5B
OpenMythos Qwen Shell is an experimental compatibility checkpoint that loads Qwen2.5-1.5B-Instruct weights into an OpenMythos PyTorch shell. It demonstrates coherent generation through the OpenMythos forward path, but it is not the final compressed recurrent OpenMythos model.
This is a promising architecture proof, not a finished product release.
- The interesting part is compatibility: Qwen2.5-1.5B-Instruct can generate coherently inside the OpenMythos runtime without collapsing the forward path.
- The repo is still a shell/transplant checkpoint, not the final compact recurrent OpenMythos model the author says they are distilling toward.
- Disabling MoE and collapsing the recurrent slot into a normal block makes the current model much closer to a dense Qwen-style checkpoint than to the intended full OpenMythos design.
- The fp32 RMSNorm note matters: this reads like a fragile but useful engineering bridge, not a clean architecture swap.
- The claim that random initialization for recurrent parts catches up faster than full-from-scratch pretraining is the most consequential takeaway, but it is still anecdotal here.
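
The fp32 RMSNorm point is not spelled out in this summary. One plausible reading, stated here as an assumption rather than as the repo's actual code, is that the normalization statistics are computed in float32 even when activations are stored in bfloat16 or float16, which is how Qwen-style reference implementations commonly handle RMSNorm. A minimal PyTorch sketch of that pattern follows; the class name `Fp32RMSNorm` and the default `eps` value are hypothetical and chosen only for illustration.

```python
import torch
import torch.nn as nn


class Fp32RMSNorm(nn.Module):
    """RMSNorm that does its reduction in float32 and returns the input dtype.

    Hypothetical illustration; not taken from the OpenMythos repo.
    """

    def __init__(self, hidden_size: int, eps: float = 1e-6):
        super().__init__()
        # Learnable per-channel scale, initialized to 1 (identity scaling).
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        input_dtype = x.dtype
        # Upcast so the mean of squares is accumulated in float32.
        x32 = x.to(torch.float32)
        variance = x32.pow(2).mean(dim=-1, keepdim=True)
        x32 = x32 * torch.rsqrt(variance + self.eps)
        # Apply the scale in float32, then cast back to the caller's dtype.
        return (self.weight.to(torch.float32) * x32).to(input_dtype)


# Usage: low-precision activations go in and come out in the same dtype.
norm = Fp32RMSNorm(hidden_size=16)
hidden = torch.randn(2, 4, 16).to(torch.bfloat16)
normed = norm(hidden)
```

Keeping only the reduction in float32 leaves the rest of the forward path in the lower-precision dtype, so the cost is confined to the norm layers. Whether the OpenMythos shell does exactly this is not confirmed by the source.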
DISCOVERED: 2026-04-26
PUBLISHED: 2026-04-25
AUTHOR: Creative-Ad-2112