Wan2.2 makes short clips, trails frontier models

// 48d ago · OPENSOURCE RELEASE

The post is a practical reality check from a local-video beginner: Wan2.2 can produce decent short clips, but the user is running into the current ceiling of open video models rather than just operator error. As of now, open/local systems are genuinely useful for image-to-video, low-motion stylized shots, and stitched micro-sequences, but they still lag private frontier models like Sora 2 and Veo 3.1 on realism, physics, prompt adherence, and especially native audio.

// ANALYSIS

Hot take: open video gen is no longer a toy, but it is still an assistant for making fragments, not a replacement for a top private model when you need polished, coherent scenes.

– Wan2.2 is a meaningful open release: the repo documents T2V/I2V/TI2V models, 720p support, 24 fps for TI2V, and ComfyUI/Diffusers integration, so the ecosystem is real and improving.
– The user’s 6-10 second ceiling is normal; short duration is still where most local models are strongest, especially when you want something you can actually steer.
– Prompt extension matters: Wan’s own docs recommend it, and in practice it helps more than adding extra adjectives manually.
– Best local use cases today are:
    – stylized b-roll
    – image-to-video motion from a strong keyframe
    – character animation/replacement
    – stitched montage-style sequences with repeated look and camera language
– What frontier private models still do better:
    – longer temporal coherence
    – better physical plausibility
    – stronger prompt following
    – cleaner motion, faces, hands, and camera transitions
    – native audio and lip sync
– What is feasible right now locally:
    – 5 to 10 second shots with a controlled look
    – storyboards built from multiple clips
    – “history video” style content if you cut around drift and use recurring references
– What is still out of reach:
    – truly long, single-take narratives with stable identity
    – dialogue-heavy scenes with convincing synced speech
    – complex multi-character blocking without drift
    – consistent scene-to-scene continuity without human editing help
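
The "storyboards built from multiple clips" approach comes down to simple arithmetic: split a target runtime into shots that stay under the short-clip ceiling, then work out a frame budget for each. The sketch below is illustrative only. `plan_shots`, `Shot`, and the defaults (a 10 second ceiling, 24 fps) are hypothetical names and assumptions drawn from the figures above, not part of Wan2.2 or any tool's API, and real pipelines may impose their own frame-count constraints.

```python
import math
from dataclasses import dataclass


@dataclass
class Shot:
    index: int
    seconds: float
    frames: int


def plan_shots(total_seconds: float,
               max_shot_seconds: float = 10.0,  # assumed ceiling per clip
               fps: int = 24) -> list[Shot]:    # assumed frame rate
    """Split a target runtime into equal shots no longer than the ceiling."""
    if total_seconds <= 0 or max_shot_seconds <= 0 or fps <= 0:
        raise ValueError("durations and fps must be positive")
    # Fewest shots that keep every shot at or under the ceiling.
    count = math.ceil(total_seconds / max_shot_seconds)
    per_shot = total_seconds / count
    return [Shot(i, per_shot, round(per_shot * fps)) for i in range(count)]


# A 45-second sequence planned as separately generated clips.
plan = plan_shots(45.0)
for shot in plan:
    print(f"shot {shot.index}: {shot.seconds:.1f}s, {shot.frames} frames")
```

Each planned shot would then be generated on its own, ideally from a shared keyframe or reference, and the continuity work is left to the edit.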

// TAGS

video generation, wan2-2, local ai, open source, image-to-video, video-gen, comfyui, llm

DISCOVERED

48d ago

2026-04-09

PUBLISHED

48d ago

2026-04-09

RELEVANCE

9/10

AUTHOR

val_in_tech
