Evo 2 opens genome-scale DNA AI
Arc Institute and NVIDIA have released Evo 2, a 40B-parameter open-source genomic foundation model trained on more than 9 trillion nucleotides from over 128,000 genomes across all domains of life. It can reason over up to 1 million bases at once, predict variant effects and splice sites, and generate new biological sequences, making it one of the most ambitious foundation-model pushes yet in computational biology.
Evo 2 is the rare biology model release that feels like real platform infrastructure, not just a one-off paper result. The bigger question now is not whether the model is impressive, but how much of its zero-shot genomic intuition will survive contact with wet-lab reality.
- The scale jump is serious: Arc says Evo 2 extends Evo 1 with 30x more training data and 8x longer context, which matters because genome function often depends on long-range structure
- Open release is a big deal here because the team published model weights, training code, inference code, and the OpenGenome2 dataset instead of keeping the stack closed
- The most developer-interesting result is generalization: one model handles bacteria, archaea, and eukaryotes well enough to spot regulatory DNA, splice sites, and pathogenic mutations without task-specific fine-tuning
- The strongest practical use case is likely variant interpretation and genome annotation, where saving researchers rounds of brute-force experimental screening could be immediately valuable
- The caveat is biological validation: sequence generation and regulatory design are exciting, but the gap between plausible DNA and reliably useful biology is still huge
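For readers wondering what "without task-specific fine-tuning" can look like in practice: zero-shot variant scoring with a sequence model is commonly framed as comparing the likelihood the model assigns to a reference sequence with the likelihood it assigns to the mutated one. The sketch below illustrates that idea only. It uses a tiny Markov model as a hypothetical stand-in, not Evo 2 or its inference API; `ToyMarkovModel`, `variant_delta`, and the training string are all invented for illustration.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


class ToyMarkovModel:
    """Hypothetical stand-in for a genomic language model (NOT Evo 2)."""

    def __init__(self, order=2):
        self.order = order
        # counts[context][next_base] = times next_base followed context
        self.counts = defaultdict(lambda: defaultdict(int))

    def fit(self, sequence):
        for i in range(self.order, len(sequence)):
            context = sequence[i - self.order:i]
            self.counts[context][sequence[i]] += 1
        return self

    def log_prob(self, context, base):
        # Add-one smoothing so unseen bases and contexts keep nonzero probability.
        seen = self.counts.get(context, {})
        total = sum(seen.values()) + len(ALPHABET)
        return math.log((seen.get(base, 0) + 1) / total)

    def log_likelihood(self, sequence):
        return sum(
            self.log_prob(sequence[i - self.order:i], sequence[i])
            for i in range(self.order, len(sequence))
        )


def variant_delta(model, ref_seq, pos, alt):
    """Log-likelihood of the mutated sequence minus that of the reference.

    A strongly negative value means the model finds the variant much less
    plausible than the reference in this context.
    """
    if alt not in ALPHABET:
        raise ValueError(f"unexpected base: {alt!r}")
    alt_seq = ref_seq[:pos] + alt + ref_seq[pos + 1:]
    return model.log_likelihood(alt_seq) - model.log_likelihood(ref_seq)


# Made-up training data: a perfectly periodic sequence.
model = ToyMarkovModel(order=2).fit("ACGT" * 50)
reference = "ACGTACGTACGT"
disruptive = variant_delta(model, reference, 5, "A")  # breaks the ACGT repeat
silent = variant_delta(model, reference, 5, "C")      # same base as reference
print(disruptive < 0, silent == 0.0)
```

A real genomic model replaces the toy likelihood with its own per-base probabilities over a far longer context, but the comparison has the same shape. Whether a low score corresponds to an actual functional effect is the wet-lab question raised above.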
DISCOVERED
2026-03-06
PUBLISHED
2026-03-05
AUTHOR
Secure-Technology-78