Helios claims real-time minute-long video
Helios is a new 14B autoregressive diffusion video model from Peking University and collaborators that claims 19.5 FPS on a single H100 while supporting minute-scale generation across text-to-video, image-to-video, and video-to-video workflows. The bigger story is that it is not just a paper drop: the team has already published code, Hugging Face checkpoints, and integrations with Diffusers, vLLM-Omni, SGLang-Diffusion, and Ascend-NPU.
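Since the Diffusers integration is called out above, here is a minimal sketch of what loading a Helios checkpoint might look like. The repo id, pipeline call signature, and output format are all assumptions for illustration; the actual model card is the source of truth.

```python
# Hypothetical sketch, not the confirmed Helios API: loading a checkpoint
# through Diffusers' generic entry point. The repo id below is assumed.
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

pipe = DiffusionPipeline.from_pretrained(
    "helios/helios-14b-distilled",  # assumed repo id; check the model card
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,         # custom pipelines often require this
)
pipe.to("cuda")

# Argument names (prompt, num_frames) mirror other Diffusers video
# pipelines and are assumptions for Helios specifically.
frames = pipe(prompt="a harbor at dawn, slow pan", num_frames=240).frames[0]
export_to_video(frames, "helios_sample.mp4", fps=24)
```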
If Helios holds up beyond curated demos, this is less a “best clip wins” story and more a signal that long-form video generation is becoming a deployment and systems problem. Shipping code and weights immediately makes it much more relevant to developers than the usual paper-only video model release.
- The headline number is 19.5 FPS on one H100, which would make minute-long generation dramatically more practical for iteration than most heavyweight video stacks (see the back-of-envelope estimate after this list)
- The paper’s boldest claim is architectural: Helios says it maintains long-video coherence without common anti-drift tricks and hits its speed without standard acceleration shortcuts like KV-cache or quantization
- The release is unusually usable out of the gate, with Base, Mid, and Distilled checkpoints plus day-0 support across popular inference frameworks
- The main caveat is validation: these are very fresh results, so the real test is whether outside researchers can reproduce the quality, coherence, and cost profile in open benchmarks and real workloads
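For scale, a quick back-of-envelope on that 19.5 FPS figure, assuming a 24 fps output clip (the release does not state a playback frame rate, so that number is an assumption):

```python
# Back-of-envelope wall-clock estimate for a one-minute clip.
# The 24 fps playback rate is an assumption, not a stated Helios spec.
gen_fps = 19.5        # claimed generation throughput on one H100
playback_fps = 24     # assumed output frame rate
clip_seconds = 60     # minute-scale target

total_frames = playback_fps * clip_seconds   # 1440 frames
wall_clock_s = total_frames / gen_fps        # ~74 seconds
print(f"{total_frames} frames -> ~{wall_clock_s:.0f}s to generate")
```

At 24 fps playback this works out to roughly 0.8x real time, so "real-time" strictly holds only for output rates at or below ~19.5 fps, though either way it is far faster than most long-video stacks.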
DISCOVERED
2026-03-08
PUBLISHED
2026-03-08
AUTHOR
AI Search