
Wan2.2-I2V-Flash reportedly delivers a 12-fold increase in inference speed over Wan2.1, along with significantly better instruction following, direct rendering of a variety of special-effects prompts, and precise camera control. It also preserves the look of the input image across a range of visual styles and produces plausible, natural motion. It costs 0.1 yuan per second, and its "card-draw" success rate (the chance that a single generation attempt yields a usable result) is 123% higher than that of Wan2.1. The model is currently available for trial through API calls on Alibaba Cloud Bailian.
On July 28, Alibaba open-sourced Tongyi Wanxiang Wan2.2, which comprises three models: text-to-video (Wan2.2-T2V-A14B), image-to-video (Wan2.2-I2V-A14B), and unified video generation (Wan2.2-IT2V-5B). The text-to-video and image-to-video models are the industry's first video generation models to use a Mixture-of-Experts (MoE) architecture, with 27B total parameters and 14B active parameters. They also introduce a cinematic aesthetic control system, with control over lighting, color, composition, and micro-expressions comparable to that of professional film.