NVIDIA drops SANA-WM video world model
NVIDIA's 2.6B-parameter SANA-WM brings open-source, minute-scale video generation with precise camera control to consumer GPUs. It reportedly achieves 36x higher throughput than previous baselines while maintaining high visual fidelity.
SANA-WM is a category-defining moment for open-source world modeling, providing a viable, controllable alternative to closed systems like Sora.
- Hybrid Linear Diffusion Transformer enables 60-second video generation without memory bottlenecks
- Precise 6-DoF camera control makes it a powerful tool for robotics and autonomous system simulation
- Optimized NVFP4 quantization allows high-resolution generation on single consumer GPUs like the RTX 5090
- Two-stage pipeline with a 17B-parameter refiner ensures temporal consistency and sharp textures across long clips
- Open-source Apache 2.0 licensing lowers the barrier for developers to build specialized video and simulation apps
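
To put the single-GPU claim in perspective, here is a rough back-of-envelope sketch of weight storage at different precisions. It is not taken from the SANA-WM release: the helper `weight_memory_gb` is hypothetical, the figure of about 4.5 bits per weight assumes NVFP4 stores 4-bit values plus one 8-bit scale per 16-value block, and whether the refiner is also quantized is an assumption. Activations, caches, and runtime overhead are ignored.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes); weights only."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumption: NVFP4 = 4-bit values + one 8-bit scale per block of 16 values.
NVFP4_BITS = 4 + 8 / 16  # about 4.5 bits per weight
BF16_BITS = 16

# Parameter counts as quoted in the summary above.
models = [("base model (2.6B)", 2.6e9), ("refiner (17B)", 17e9)]

for name, params in models:
    bf16 = weight_memory_gb(params, BF16_BITS)
    nvfp4 = weight_memory_gb(params, NVFP4_BITS)
    print(f"{name}: BF16 ~{bf16:.1f} GB, NVFP4 ~{nvfp4:.1f} GB")
```

Under these assumptions, 4-bit storage cuts the weight footprint to a bit over a quarter of the 16-bit size, which is the kind of reduction that makes a model of this scale plausible on one consumer card.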
DISCOVERED
2026-05-17
PUBLISHED
2026-05-17
AUTHOR
AI Search