Today, Tongyi Qianwen officially released Qwen-Image-Edit, an image editing model built as an upgrade of its 20B Qwen-Image model. The release extends Qwen-Image's text rendering capabilities to image editing for the first time. Users can try the model through the "Image Edit" feature in Qwen Chat.
Qwen-Image-Edit's core advantage is its dual editing capability: it combines the visual semantic control of Qwen2.5-VL with the appearance control of a VAE encoder. As a result, users can precisely modify text in images, in both Chinese and English, and can also perform edits ranging from low-level operations, such as adding or removing elements, to high-level operations, such as style transfer. For example, the model can generate MBTI-themed emoji for the Capybara mascot, rotate an object 90 degrees to show its back, and transform a portrait into the Ghibli animation style.
In practical use, the model performs well on fine-grained edits. When it adds a sign and automatically generates the matching reflection, or removes a minor imperfection such as a stray strand of hair, Qwen-Image-Edit leaves the rest of the image unchanged. It also supports chained editing: for example, the typos in a calligraphy work can be corrected step by step until the result is an accurate version of the Lanting Preface. According to official tests, the model reaches industry-leading results on multiple benchmarks, making it an efficient tool for design, advertising, and content creation.
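The chained editing workflow described above can be pictured as a simple loop in which the output of each edit becomes the input of the next. The sketch below is only an illustration under stated assumptions: `chain_edits`, `EditFn`, and `stub_edit` are hypothetical names, and the stub works on a text string standing in for an image rather than calling Qwen-Image-Edit or any real API.

```python
from typing import Callable, List, Tuple, TypeVar

Image = TypeVar("Image")

# Hypothetical editor signature (an assumption, not the model's real API):
# it takes the current image and one instruction, and returns the edited image.
EditFn = Callable[[Image, str], Image]


def chain_edits(
    image: Image, instructions: List[str], edit_fn: EditFn
) -> Tuple[Image, List[Image]]:
    """Apply the instructions in order, feeding each result into the next step."""
    history = [image]  # keep every intermediate result for inspection
    for instruction in instructions:
        image = edit_fn(image, instruction)
        history.append(image)
    return image, history


def stub_edit(image: str, instruction: str) -> str:
    """Stand-in editor: the 'image' is a string, the instruction is 'wrong->right'."""
    wrong, right = instruction.split("->")
    return image.replace(wrong, right)


if __name__ == "__main__":
    final, steps = chain_edits(
        "Lantnig Prefase", ["tnig->ting", "fase->face"], stub_edit
    )
    for step in steps:
        print(step)
```

In a real setup, `stub_edit` would be replaced by a call to whatever image editing interface is available. Keeping the intermediate results makes it possible to check each correction before moving on to the next one.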