MII-LLM releases Zagreus, Nesso small models
OPEN_SOURCE
REDDIT · 14h ago · MODEL RELEASE


MII-LLM’s report details how it trained a family of 0.4B bilingual LLMs from scratch for Italian, Spanish, French, and Portuguese. The release includes four base Zagreus checkpoints, three Nesso post-trained variants, and a fully open recipe built around edge deployment.

// ANALYSIS

This is a strong example of small-model engineering done seriously: the value is not just the weights, but the full reproducible pipeline from tokenization to Slurm orchestration to post-training. In the sub-1B regime, disciplined data and training choices matter more than architectural novelty.

  • Dense 0.4B is the sensible call here; MoE complexity is hard to justify when stability and hardware utilization are the bottlenecks.
  • The bilingual English + target-language setup is a practical way to cover European languages without pretending a tiny model can be universal.
  • Nesso-agentic is likely the most useful checkpoint in the set because structured output and function calling are where small models can still feel “product-ready.”
  • The benchmark story is encouraging, but the real ceiling remains visible: arithmetic, factual recall, and repetition are still weak points.
  • The open variant is the most interesting piece for the ecosystem, because reproducible small-model training is still rare and highly transferable.
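The "product-ready" claim for small agentic models usually comes down to strict validation wrapped around their structured output, with a retry upstream when parsing fails. A minimal sketch of such a guardrail, assuming a hypothetical `get_weather` tool and a simple name/arguments JSON format (illustrative only, not the actual Nesso-agentic calling schema):

```python
import json

# Hypothetical tool registry -- the real Nesso-agentic format is not
# documented here, so this schema is an assumption for illustration.
TOOL_SCHEMA = {
    "get_weather": {"required": {"city"}, "optional": {"unit"}},
}

def validate_tool_call(raw: str) -> dict:
    """Parse and validate a model-emitted function call against TOOL_SCHEMA.

    Sub-1B models often emit almost-correct JSON; strict validation plus
    a retry loop on failure is the usual mitigation.
    """
    call = json.loads(raw)  # raises ValueError on malformed JSON
    name = call.get("name")
    if name not in TOOL_SCHEMA:
        raise ValueError(f"unknown tool: {name!r}")
    spec = TOOL_SCHEMA[name]
    args = set(call.get("arguments", {}))
    missing = spec["required"] - args
    extra = args - spec["required"] - spec["optional"]
    if missing or extra:
        raise ValueError(f"bad arguments: missing={missing}, extra={extra}")
    return call

# A well-formed call passes; a hallucinated argument is rejected.
ok = validate_tool_call('{"name": "get_weather", "arguments": {"city": "Rome"}}')
```

The point of the sketch: the model never gets trusted access to a tool; only calls that survive schema validation are executed, which is what makes small-model function calling usable in products.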
// TAGS
zagreus · nesso · mii-llm · llm · edge-ai · open-source · benchmark · mlops

DISCOVERED

14h ago · 2026-04-17

PUBLISHED

14h ago · 2026-04-17

RELEVANCE

9/10

AUTHOR

kazzus78