OpenMed trains mRNA models across 25 species
OPEN_SOURCE
REDDIT · 11d ago · TUTORIAL

OpenMed published a deep technical walkthrough of its protein AI pipeline, from ESMFold and ProteinMPNN through codon optimization. The standout result is CodonRoBERTa-large-v2, which reaches a perplexity of 4.10 and a CAI Spearman correlation of 0.404 while scaling to 25 species in 55 GPU-hours.
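To make the two headline metrics concrete, here is a minimal sketch of how each is typically computed: perplexity as the exponential of the mean token negative log-likelihood, and the CAI correlation as a Spearman rank correlation between predicted and reference codon-adaptation scores. Function names and the tie-free ranking are illustrative simplifications, not OpenMed's actual evaluation code.

```python
import math

def perplexity(mean_nll: float) -> float:
    """Perplexity is exp of the mean per-token negative log-likelihood."""
    return math.exp(mean_nll)

def spearman(xs: list[float], ys: list[float]) -> float:
    """Spearman rank correlation (illustrative version without tie handling)."""
    def ranks(v: list[float]) -> list[float]:
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)
```

A perplexity of 4.10 means the model is, on average, about as uncertain as a uniform choice among ~4 codons per position; a Spearman of 0.404 means the model's CAI ranking agrees moderately with the reference ranking.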

// ANALYSIS

The interesting part here is not just that OpenMed trained another biological language model, but that it showed a classic RoBERTa-style stack beating a more modern transformer on codon data.

  • CodonRoBERTa-large-v2 clearly beat ModernBERT, which suggests biology-specific inductive bias still matters more than the latest NLP architecture trends
  • The jump from low CAI correlation to 0.404 after hyperparameter tuning is the real win, because it moves the model from “predictive” to biologically useful
  • Training 4 production models across 25 species for $165 is a strong signal that specialized biomedical modeling is becoming cheap enough for small teams
  • The species-conditioned setup is the most defensible product angle here, since it turns a single codon model into a multi-organism system
  • The article reads more like a reproducible research notebook than a marketing post, which makes the claims easier to trust
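The species-conditioned setup called out above is commonly implemented by prepending a per-organism tag token to the codon sequence, so one shared model serves every organism. The sketch below illustrates that tokenization pattern; the vocabulary layout, species names, and function names are assumptions for illustration, not OpenMed's actual code.

```python
# Illustrative species-conditioned codon tokenizer: a species tag token
# is prepended to the codon tokens, turning one model into a
# multi-organism system.

SPECIES = ["homo_sapiens", "e_coli", "s_cerevisiae"]  # illustrative subset
CODONS = [a + b + c for a in "ACGU" for b in "ACGU" for c in "ACGU"]  # all 64 codons

# Special tokens first, then one tag per species, then the codon vocabulary.
VOCAB = ["<pad>", "<cls>"] + [f"<sp:{s}>" for s in SPECIES] + CODONS
TOK2ID = {t: i for i, t in enumerate(VOCAB)}

def encode(mrna: str, species: str) -> list[int]:
    """Split an mRNA coding sequence into codons and prepend the species tag."""
    assert len(mrna) % 3 == 0, "coding sequence length must be a multiple of 3"
    codons = [mrna[i:i + 3] for i in range(0, len(mrna), 3)]
    return [TOK2ID[f"<sp:{species}>"]] + [TOK2ID[c] for c in codons]
```

Conditioning this way keeps the codon embedding table shared across organisms, so adding a 26th species only costs one new tag token rather than a new model.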
// TAGS
openmed · open-source · research · benchmark · gpu · llm

DISCOVERED

11d ago

2026-04-01

PUBLISHED

11d ago

2026-03-31

RELEVANCE

7/10

AUTHOR

dark-night-rises