DomainShuttle enables subject-consistent video generation
DomainShuttle is a text-to-video framework built on Wan-2.2 and designed to keep a subject consistent across diverse styles and domains. It uses a Domain Mixture-of-Transformers and a DualRoPE spatial scheme to decouple identity features from background style.
DomainShuttle addresses the familiar trade-off between style adherence and identity preservation by separating the spatial embedding spaces of the reference image and the target video.
* The Video-Reference DualRoPE is an architectural change that keeps reference image tokens from contaminating the spatial structure of the generated video.
* Building on the Wan-2.2 foundation model gives the project a strong baseline and is far more practical than training a subject-driven model from scratch.
* The results look promising, but actual performance still depends heavily on the quality and orientation of the input reference image.
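The summary does not spell out how DualRoPE assigns positions, so the following is only a minimal sketch of the general idea of keeping reference and video tokens in separate spatial index ranges. It assumes a plain 2-D rotary embedding and a hypothetical offset scheme in which the reference grid is shifted past the video grid; the function names (`rope_angles`, `apply_rope`, `dual_positions`, `rope_2d`) are illustrative and not taken from DomainShuttle.

```python
import numpy as np

def rope_angles(positions, dim, base=10000.0):
    """Standard 1-D RoPE angles, one per (position, frequency pair)."""
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    return np.outer(positions, inv_freq)  # shape: (n_tokens, dim // 2)

def apply_rope(x, angles):
    """Rotate consecutive feature pairs of x by the given angles."""
    x1, x2 = x[:, 0::2], x[:, 1::2]
    cos, sin = np.cos(angles), np.sin(angles)
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

def dual_positions(video_hw, ref_hw):
    """Assumed scheme (not from the source): shift the reference grid
    past the video grid so the two coordinate ranges never overlap."""
    vh, vw = video_hw
    rh, rw = ref_hw
    vy, vx = np.meshgrid(np.arange(vh), np.arange(vw), indexing="ij")
    ry, rx = np.meshgrid(np.arange(rh) + vh, np.arange(rw) + vw, indexing="ij")
    video = np.stack([vy.ravel(), vx.ravel()], axis=1)
    ref = np.stack([ry.ravel(), rx.ravel()], axis=1)
    return video, ref

def rope_2d(x, pos):
    """2-D RoPE: first half of the features encodes y, second half x."""
    half = x.shape[1] // 2  # half must be even so features pair up
    top = apply_rope(x[:, :half], rope_angles(pos[:, 0], half))
    bottom = apply_rope(x[:, half:], rope_angles(pos[:, 1], half))
    return np.concatenate([top, bottom], axis=1)

# Usage: one 4x6 video frame of latent tokens and a 2x3 reference image.
video_pos, ref_pos = dual_positions((4, 6), (2, 3))
rng = np.random.default_rng(0)
video_q = rope_2d(rng.normal(size=(len(video_pos), 8)), video_pos)
ref_k = rope_2d(rng.normal(size=(len(ref_pos), 8)), ref_pos)
```

Because the reference coordinates start where the video coordinates end, no reference token shares a spatial position with a video token, which is one simple way to stop attention from treating the reference image as part of the frame layout. The actual DomainShuttle scheme may differ.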
DISCOVERED
2026-06-28
PUBLISHED
2026-06-28
AUTHOR
AI Search