NVIDIA Cosmos 3 hits DeepInfra serverless
DeepInfra has added serverless inference support for NVIDIA's Cosmos 3 physical AI foundation model family, starting with the 16B Cosmos 3 Nano model. Its Mixture-of-Transformers architecture targets sub-second reasoning and generation for physical AI applications such as robotics and autonomous vehicles, and serverless hosting removes the need for a local GPU.
Hosting Cosmos 3 on DeepInfra gives developers without their own GPU infrastructure access to low-latency physical AI reasoning, but the real test is whether end-to-end latency through a serverless endpoint, network round trip included, can meet the strict real-time requirements of edge robotics.
* At 16B parameters, Cosmos 3 Nano is sized for sub-second inference, which suits latency-sensitive applications.
* The Mixture-of-Transformers architecture represents a shift towards models that reason about physics and action before generating output.
* Serverless hosting on DeepInfra lowers the upfront cost and infrastructure complexity for developers prototyping robotics and simulation applications.
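The latency question raised above can be checked empirically before committing to a hosted model in a control loop. The following is a minimal sketch, not DeepInfra's or NVIDIA's tooling: the function names (`measure_latencies`, `p95`, `meets_budget`) are hypothetical, and the 0.5 s budget in the usage note is an assumed figure, not a published requirement. The `call` argument stands in for whatever client code sends one request to the hosted model and waits for the full response.

```python
import time
from typing import Callable, List


def measure_latencies(call: Callable[[], object], runs: int = 20) -> List[float]:
    """Time `call` end to end `runs` times and return seconds per call."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()  # hypothetical: one full request/response to the hosted model
        samples.append(time.perf_counter() - start)
    return samples


def p95(samples: List[float]) -> float:
    """95th percentile latency using the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Integer form of ceil(0.95 * n), avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_budget(samples: List[float], budget_s: float) -> bool:
    """True if the 95th percentile latency fits within the budget."""
    return p95(samples) <= budget_s
```

For example, `meets_budget(measure_latencies(send_request), 0.5)` would report whether a hypothetical `send_request` function stays within an assumed 0.5 s budget at the 95th percentile. Tail latency is used rather than the mean because a real-time loop is constrained by its slowest responses, not its typical ones.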
DISCOVERED: 2026-06-03
PUBLISHED: 2026-06-03
AUTHOR: DeepInfra