OPEN_SOURCE
REDDIT // MODEL RELEASE
MII-LLM releases Zagreus and Nesso small models
MII-LLM’s report details how it trained a family of 0.4B bilingual LLMs from scratch, each pairing English with Italian, Spanish, French, or Portuguese. The release includes four base Zagreus checkpoints, three Nesso post-trained variants, and a fully open training recipe built around edge deployment.
// ANALYSIS
This is a strong example of small-model engineering done seriously: the value is not just the weights, but the full reproducible pipeline from tokenization to Slurm orchestration to post-training. In the sub-1B regime, disciplined data and training choices matter more than architectural novelty.
- Dense 0.4B is the sensible call here; MoE complexity is hard to justify when stability and hardware utilization are the bottlenecks.
- The bilingual English + target-language setup is a practical way to cover European languages without pretending a tiny model can be universal.
- Nesso-agentic is likely the most useful checkpoint in the set because structured output and function calling are where small models can still feel “product-ready.”
- The benchmark story is encouraging, but the real ceiling remains visible: arithmetic, factual recall, and repetition are still weak points.
- The open variant is the most interesting piece for the ecosystem, because reproducible small-model training is still rare and highly transferable.
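To illustrate the kind of structured-output workload an agentic checkpoint like Nesso-agentic targets, here is a minimal sketch of the consumer side: parsing and validating a JSON tool call out of raw model text. The tool name, schema, and output format are hypothetical, not MII-LLM's actual interface; the point is that small models often emit stray tokens around the payload, so robust extraction matters.

```python
import json

# Hypothetical tool registry a small agentic model would be prompted with.
TOOLS = {
    "get_weather": {
        "description": "Return the weather for a city",
        "parameters": {"city": "string"},
    }
}

def parse_tool_call(model_output: str) -> dict:
    """Extract and validate a JSON tool call from raw model text.

    Small models often wrap the JSON payload in extra tokens, so we
    locate the outermost braces before parsing rather than assuming
    the whole output is valid JSON.
    """
    start = model_output.find("{")
    end = model_output.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object in model output")
    call = json.loads(model_output[start : end + 1])
    if call.get("name") not in TOOLS:
        raise ValueError(f"unknown tool: {call.get('name')!r}")
    return call

# Example of the kind of output a post-trained checkpoint might produce.
raw = 'Sure. {"name": "get_weather", "arguments": {"city": "Rome"}}'
call = parse_tool_call(raw)
print(call["name"], call["arguments"]["city"])  # get_weather Rome
```

A validation layer like this is what makes sub-1B function calling usable in practice: the model only has to hit a narrow JSON target, and everything else is deterministic code.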
// TAGS
zagreus · nesso · mii-llm · llm · edge-ai · open-source · benchmark · mlops
DISCOVERED
2026-04-17
PUBLISHED
2026-04-17
RELEVANCE
9/10
AUTHOR
kazzus78