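Today, Stability AI announces SDXL 0.9, the most advanced development in the Stable Diffusion text-to-image suite of models. Following the successful release of the Stable Diffusion XL beta in April, SDXL 0.9 produces significantly improved image and composition detail over its predecessor.
The model can be accessed through Clipdrop today, with the API coming shortly. Research weights are now available, with an open release coming in mid-July as we move to 1.0.
Although SDXL 0.9 can run on a modern consumer GPU, it represents a leap in creative use cases for generative AI imagery. Its ability to generate hyper-realistic creations for films, television, music, and instructional videos, along with improvements for design and industrial use, puts SDXL at the forefront of real-world applications for AI imagery.
Examples
Some example prompts tested on both the SDXL beta (left) and 0.9 show how far the model has come in just two months.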
The SDXL series also offers a range of functionalities that extend beyond basic text prompting. These include image-to-image prompting (inputting one image to get variations of that image), inpainting (reconstructing missing parts of an image), and outpainting (constructing a seamless extension of an existing image).
What’s under the hood?
The key driver of this advancement in composition for SDXL 0.9 is its significant increase in parameter count (the total number of weights and biases in the model's neural network) over the beta version.
SDXL 0.9 has one of the largest parameter counts of any open-source image model, with a 3.5B parameter base model and a 6.6B parameter model ensemble pipeline (the final output is created by running two models and aggregating the results). The second-stage model of the pipeline adds finer details to the output generated by the first stage.
By comparison, the beta version runs on 3.1B parameters and uses just a single model.
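The parameter counts quoted above lend themselves to some quick arithmetic. The sketch below is illustrative only: it estimates how much storage the weights alone would take at two common floating-point widths, and how many pipeline parameters lie outside the base model. The helper names (`weight_size_gb`, `remaining_params`) are hypothetical, and the estimate ignores activations, text encoders loaded separately, and any memory-saving techniques, so it is not a statement about actual VRAM use. Attributing the remainder to the second stage also assumes the two stages share no parameters, which the announcement does not confirm.

```python
# Back-of-the-envelope arithmetic on the parameter counts quoted in the text.
BASE_PARAMS = 3.5e9      # base model (from the announcement)
PIPELINE_PARAMS = 6.6e9  # two-model ensemble pipeline (from the announcement)
BETA_PARAMS = 3.1e9      # beta version (from the announcement)

# Bytes used to store one parameter in each floating-point format.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weight_size_gb(num_params: float, dtype: str = "fp16") -> float:
    """Approximate size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def remaining_params(pipeline: float = PIPELINE_PARAMS,
                     base: float = BASE_PARAMS) -> float:
    """Pipeline parameters that are not part of the base model.

    Assumption: the stages share no parameters, so this remainder
    belongs to the second stage. The source does not state this.
    """
    return pipeline - base


if __name__ == "__main__":
    for name, count in [("beta", BETA_PARAMS),
                        ("0.9 base", BASE_PARAMS),
                        ("0.9 pipeline", PIPELINE_PARAMS)]:
        print(f"{name:>13}: {weight_size_gb(count, 'fp16'):.1f} GB in fp16, "
              f"{weight_size_gb(count, 'fp32'):.1f} GB in fp32")
    print(f"outside base : {remaining_params() / 1e9:.1f}B parameters")
```

Under these assumptions, halving the bytes per parameter halves the weight footprint, which is one reason half-precision weights are a common choice on consumer GPUs.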
SDXL 0.9 uses two CLIP models, including one of the largest OpenCLIP models trained to date (OpenCLIP ViT-G/14), which boosts 0.9’s processing power and its ability to create realistic imagery with greater depth at a higher resolution of 1024×1024.
The SDXL team will shortly release a research blog post covering the specifications and testing of this model in greater detail.
System requirements
Despite its powerful output and advanced model architecture, SDXL 0.9 can run on a modern consumer GPU. It needs only Windows 10 or 11 or a Linux operating system, 16GB of RAM, and an Nvidia GeForce RTX 20 graphics card (or an equivalent or higher standard) with at least 8GB of VRAM. Linux users can also use a compatible AMD card with 16GB of VRAM.
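As a rough illustration, the requirements above can be written as a small checklist. This is a minimal sketch, not an official compatibility tool: the `SystemInfo` class and `unmet_requirements` function are hypothetical names, the caller supplies the system details by hand, and the check looks only at GPU vendor and VRAM. It does not verify the GPU generation (RTX 20 or an equivalent or higher standard) or whether an AMD card is a "compatible" one.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SystemInfo:
    """Hand-entered description of a machine (hypothetical helper type)."""
    os_name: str     # e.g. "Windows 10", "Windows 11", "Linux"
    ram_gb: float    # system memory
    gpu_vendor: str  # "nvidia" or "amd"
    vram_gb: float   # GPU memory


# Thresholds taken from the requirements paragraph.
SUPPORTED_OS = ("windows 10", "windows 11", "linux")
MIN_RAM_GB = 16
MIN_VRAM_GB = {"nvidia": 8, "amd": 16}


def unmet_requirements(info: SystemInfo) -> List[str]:
    """Return a list of unmet requirements; an empty list means all checks pass.

    Not checked here: GPU generation and AMD card compatibility.
    """
    problems = []
    os_name = info.os_name.strip().lower()
    vendor = info.gpu_vendor.strip().lower()

    if os_name not in SUPPORTED_OS:
        problems.append(f"unsupported operating system: {info.os_name}")
    if info.ram_gb < MIN_RAM_GB:
        problems.append(f"needs at least {MIN_RAM_GB}GB of RAM")
    if vendor not in MIN_VRAM_GB:
        problems.append(f"unsupported GPU vendor: {info.gpu_vendor}")
    else:
        # AMD cards are listed for Linux users only.
        if vendor == "amd" and os_name != "linux":
            problems.append("AMD cards are listed for Linux only")
        if info.vram_gb < MIN_VRAM_GB[vendor]:
            problems.append(f"needs at least {MIN_VRAM_GB[vendor]}GB of VRAM")
    return problems


if __name__ == "__main__":
    machine = SystemInfo("Windows 11", ram_gb=16, gpu_vendor="nvidia", vram_gb=8)
    print(unmet_requirements(machine) or "all listed checks pass")
```

Returning a list of problems rather than a single boolean makes it easier to see which requirement a given machine fails.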
Beta launch statistics
Since SDXL’s beta launch on April 13, we’ve had a great response from our Discord community of nearly 7,000 users. These users have generated more than 700,000 images, averaging more than 20,000 per day. More than 54,000 images have been entered into Discord community ‘Showdowns’, with 3,521 SDXL images nominated as winners.
Availability
SDXL 0.9 is now available on the Clipdrop by Stability AI platform. Stability AI API and DreamStudio customers will be able to access the model this Monday, 26th June, and it will also be available in other leading image generation tools such as NightCafe.
SDXL 0.9 will be provided for research purposes only, for a limited period, so that we can collect feedback and fully refine the model before its general open release. The code to run it will be publicly available on GitHub.
Researchers who would like to access these models can apply using the following links: SDXL-0.9-Base model and SDXL-0.9-Refiner. Please log in to your Hugging Face account with your academic email to request access. Kindly remember that SDXL 0.9 is currently intended exclusively for research purposes.
What’s next?
SDXL 0.9 will be followed by the full open release of SDXL 1.0 targeted for mid-July (timing TBC).
License
SDXL 0.9 is released under a non-commercial, research-only license and is subject to its terms of use.
Contact
For further information or to provide feedback on SDXL 0.9, we welcome you to contact us at research@stability.ai.