Details
- Alibaba Cloud’s Qwen project has unveiled Qwen-Image-Layered, a diffusion model designed to generate multi-layer RGBA files instead of simple flat bitmaps.
- The model decomposes each generated image into 3 to 10 separate layers, such as background, mid-ground, and foreground objects, each with its own alpha channel for seamless editing in applications like Photoshop, Figma, and After Effects.
- Users can specify the number, order, and content of layers directly in the prompt, removing the need for manual masking when building complex scenes.
- Qwen has open-sourced the model's code, 1.8-billion-parameter checkpoints, training scripts, and a 50,000-image RGBA dataset under the Apache-2.0 license, supporting both commercial and research use.
- The model reports 93 percent Intersection-over-Union (IoU) on the new LayerBench benchmark while using 40 percent fewer GPU hours than a Stable Diffusion XL plus ControlNet baseline.
- The release includes a web demo, a Docker image, and a four-step inference guide, letting developers quickly deploy the model locally or in the cloud.
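To make the layered-output idea concrete: per-layer RGBA images flatten back into a single picture via the standard Porter-Duff "over" operator, the same math design tools like Photoshop and Figma apply to a layer stack. The sketch below is illustrative only and does not use the model's actual API; the two tiny "layers" are hypothetical stand-ins for its per-layer outputs.

```python
# Minimal sketch of flattening an RGBA layer stack with the "over" operator.
# Pixels are (r, g, b, a) tuples with channels in [0.0, 1.0].

def over(top, bottom):
    """Composite one RGBA pixel over another (Porter-Duff 'over')."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    out_a = ta + ba * (1.0 - ta)
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)
    blend = lambda t, b: (t * ta + b * ba * (1.0 - ta)) / out_a
    return (blend(tr, br), blend(tg, bg), blend(tb, bb), out_a)

def flatten(layers):
    """Flatten a bottom-to-top stack of same-sized RGBA 'images'
    (lists of pixel rows) into one image."""
    result = layers[0]
    for layer in layers[1:]:
        result = [[over(t, b) for t, b in zip(trow, brow)]
                  for trow, brow in zip(layer, result)]
    return result

# Hypothetical 1x2-pixel layers: an opaque blue background, plus a
# foreground whose left pixel is opaque yellow and right pixel transparent.
background = [[(0.1, 0.2, 0.5, 1.0), (0.1, 0.2, 0.5, 1.0)]]
foreground = [[(0.9, 0.9, 0.3, 1.0), (0.0, 0.0, 0.0, 0.0)]]
flat = flatten([background, foreground])
# flat[0][0] keeps the foreground pixel; flat[0][1] shows the background
# through the transparent region -- edit either layer and recomposite.
```

Because each layer carries its own alpha channel, any single layer can be swapped or retouched and the stack recomposited without disturbing the others, which is the editing workflow the release targets.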
Impact
Qwen’s model raises the competitive stakes for major image generators, pressuring them to offer more editable, layered outputs. By automating layer separation, the tool could significantly cut workflow time and cost for design studios, accelerating enterprise adoption of generative AI. Open-source availability also positions Qwen to align with China’s data policies and attract a broad developer and commercial community, challenging established players such as Stability AI and OpenAI.
